I. INTRODUCTION



A. BACKGROUND 

The Department of Defense for the last twenty to thirty
years has become more and more reliant on automatic data 
processing equipment to accomplish its seemingly ever 
increasing and complex mission. When this trend started, 
hardware was the overriding concern, consuming, in 1955, 
more than 80 percent of the data processing dollar [1].
Through the years, technical innovations, such as the evolution
from vacuum tubes to discrete transistors and from
discrete transistors to integrated circuits, coupled with 
the increased use of mass production have decreased the cost 
of hardware. However, software has continued to rise in 
price. This rise in the price of software and the decrease 
in the price of hardware has resulted in software rapidly 
becoming the more costly of the two, and it is predicted 
that by 1985 it will account for better than 90 percent of 
the data processing dollar [2]. 

The true impact of this development may not appear to be 
significant until one realizes that the value of this soft- 
ware in 1973 was set at 20 billion dollars for the United 
States [3], and is estimated to be over 200 billion dollars 
in 1985 [4]. 

As a direct result of the monetary value of software 
production, many techniques have been developed to estimate, 
at the start, what the overall life cycle cost of a software 
project will be. A recent study conducted by Hughes 
Aircraft Company for the Air Force examined twenty-one of 
these models to determine commonalities and differences in 
their cost estimating approaches. Ten of these models are
limited to software development cost, while eleven have
software support cost as a primary or secondary output.
Table I lists all of the models studied, in alphabetical
order [5].

Originally, it was thought that development costs were 
the most important item to derive and/or estimate. In fact, 
the development and design efforts for a new system are 
indeed still looked upon as more enjoyable and rewarding 
than the maintenance effort for an existing system. There 
are, of course, many reasons for this view. Six of these 
reasons, according to Robert Glass, are:

1. Maintenance is intellectually very difficult. 
Problems cannot be bounded. The cause could be 
anywhere. 

2. Maintenance is technically very difficult. Problems 
cannot be specialized. They could surface because of 
errors in the coding, design, architecture, or 
concept. 

3. Maintenance is unfair. Usually the person who is maintaining
a product did not write it and must interpret
what the original author meant. Documentation is
inadequate most of the time.

4. Maintenance is no-win. People only come to maintenance
with problems.

5. Maintenance is infamous. There is very little glory,
noticeable progress, or chance for 'success'.

6. Maintenance lives in the past. The general quality of
code being maintained is often terrible. This is
partly because it was created when everybody's understanding
of software was more rudimentary, and partly
because a great deal of code is produced by people
before they become really good at programming. [6]

However, more and more research is being conducted on
the maintenance aspect of software cost estimation. The 
reason for this is becoming apparent, as it has been estimated
that from forty percent to ninety-five percent of life
cycle costs can be attributed to the maintenance effort [7].
The reason for this wide range of estimation seems to lie in
the way various organizations view what constitutes
maintenance.

The definition of software maintenance appears to vary 
with the organization and seems to be affected by management
constraints. Software maintenance can cover the spectrum 
from correction of bugs caused by coding errors and design 
inadequacies to enhancements whose purpose is to add whole 
new ideas and/or design concepts not specified for inclusion 
in the original system. The lack of a standard definition 
for maintenance is a major contributor to the paucity of 
data collection in this area. In many organizations, espe- 
cially military, as top level management personnel rotate 
through specific positions, different definitions of what 
constitutes software maintenance also rotate through these 
positions and the organizational levels they control. As a 
direct result, data collection requirements change to 
complement the definition of maintenance and, as a conse- 
quence, no consistent track of a project's manpower usage 
history can be recreated. Of greater significance is the
lack of a standard maintenance policy within the organization,
including a maintenance strategy which will add to the
degree of software maintainability, if not assure it.

In view of the large costs associated with software 
maintenance, GAO conducted a study which reviewed fifteen 
Federal computer installations in detail. Their findings
pointed to two major contributors to the problem: the fact
that, in the majority of agencies, maintenance is not
managed as a separate, identifiable function, and the
absence of a uniform definition of maintenance [8].
GAO's recommendations included development of a standard
definition of maintenance by the National Bureau of 
Standards and delineation of maintenance as a discrete func- 
tion by agency heads. In the interim, GAO developed a check- 
list of items, the consideration of which could reduce 
maintenance costs. In the checklist is a set of categories 
for recording maintenance costs. These six categories appear 
to reflect GAO's definition of maintenance and as such, are 
listed below: 

1. Modify or enhance software to make it do things for
the end user that were not requested in the original
system design.

2. Modify or enhance software to make it do things for 
the end users that were called for in the original 
design but which were not present in the first produc- 
tion version of the software. 

3. Remove defects in which the software does something
other than what the user wanted ("does the wrong
things").

4. Remove defects in which the software is programmed
incorrectly ("does the desired calculation, but gives
an incorrect answer").

5. Optimize the software to reduce the machine costs of
running it, leaving the user results unchanged.

6. Make miscellaneous modifications, such as those needed
to interface with new releases of operating
systems. [9]

This "definition" appears to have general applicability over 
the broad spectrum of activities which can be and have been 
grouped under the category of software maintenance. However,
category one may cause problems in the context of maintenance
cost estimation techniques based on the Rayleigh curve.
Since enhancements necessarily require some design/development
effort by their very nature (they give the product
capabilities not called for in the original design), the
manning level in such effort would exhibit a rise and then a 
fall in magnitude in the Rayleigh fashion, thus creating a 
series of small Rayleigh curves within the maintenance 
phase. As long as this behavior did not vary greatly from 
the normal maintenance effort for that project, it would not 
have much effect on the project. However, if the front end 
of the curve rose beyond some predefined maintenance support 
boundary, then it would indicate the presence of a full 
scale development project instead of a pure maintenance 
effort, and it should signal the completion of the old 
project and the start of a new one. Therefore, because of
the nature of the software life cycle, even a standard definition
of maintenance has grey areas and management judgement
must be used in its application.

The GAO definition does, as stated earlier, provide a 
good, general definition of software maintenance and, as 
such, for the purposes of this thesis, software maintenance 
encompasses all of its categories. 

B. PROBLEM DEFINITION 

James F. Green and Brenda F. Selby, formerly of the
Naval Postgraduate School, having reviewed Putnam's Software
Cost Estimating Model, the Army Macro-estimating Model, the
Lehman-Belady Model, and the Parr Model, have proposed a
dual theory for maintenance requirements estimation. They
proposed that, if one considered maintenance to include all
effort applied to a software project from the time that the
product was released to the user, the peak maintenance
manloading required could be calculated by computing the
inflection point on a Rayleigh curve for the total software
life cycle effort. They further proposed that one could
predict the minimum maintenance manloading requirements by
computing the inflection point on the Rayleigh curve representing
the maintenance life cycle.

The proposed Green/Selby Model, upon cursory examina- 
tion, appears to have tremendous potential as a tool for the 
manager of software projects. However, Green and Selby were 
not able to obtain sufficient data to thoroughly validate 
the applicability of the model to real world situations. 
Therefore, much further work is needed in this area. 

C. RESEARCH OBJECTIVES 

The objectives of the research are twofold: to evaluate
the Green/Selby model for prediction of maintenance costs
via projection of maintenance manloading, both for mainte- 
nance team development and for out year support resource 
estimation, and to provide an analysis of applications of 
the model in areas other than project management and 
control. The Green/Selby model addresses two areas: a maintenance
planning concept which is concerned with the overall
maintenance strategy as applied to a particular software 
project and a maintenance control concept which is concerned 
with manloading requirements estimation. Only the latter 
will be dealt with in this research. 

The evaluation of the model will be accomplished in the 
pursuit of three subobjectives. The first is to provide an 
analysis of software maintenance costing problems and a 
synopsis from the literature of other existing models and 
techniques, some of which were used in the initial 
Green/Selby model development, and some of which the authors 
feel are of equal importance and which may contribute to 
further development or application of the Green/Selby model.
The second subobjective is to validate the development of
the Green/Selby model through analysis of the mathematical 
relationships and through recreation of the empirical devel- 
opment. The third subobjective is to validate the model with 
actual data from as many different sized software projects 
as possible to ascertain the degree to which the model is 
applicable to real world software costing problems. 

Based on the results of the data analysis, projections 
will be made as to possible applications of the model in 
areas other than cost estimation, if such applications 
appear to exist. 

D. ASSUMPTIONS/LIMITATIONS

Three major assumptions were made at the onset of the 
research effort for this thesis. Other assumptions were 
necessary at specific junctures of the research but they do
not apply in every case, so they are discussed where they 
are applicable. The major assumptions are as follows: 

1. It was assumed, based on limited prior study in the 
subject area, that the software project life cycle and 
all of its phases followed the general pattern of the 
Rayleigh curve.

2. It was assumed that the Green/Selby Model was valid in 
its development though not thoroughly tested in its 
application. 

3. It was assumed that there is little difference in how 
project size affects the manning behavior of a project 
during the individual phase cycles and during the 
total project life cycle. 

Three major constraints were found to limit the research 
effort. They are as follows: 

1. There was found to be a serious lack of readily available
data which applied to the maintenance phase.

2. There appears to have been little major research done
in the area of software maintenance manloading/cost
estimation.

3. Because of the nature of the subject area and the 
variance of maintenance data collection across organizations,
the research completed and data collected to
date appear to have involved what have recently been
categorized as inefficient and maintenance-intensive
design techniques. Therefore, early works and present
research using old data may be rendered suspect, if not
invalid, by the use of such techniques as modularization,
information hiding modules, and other recently developed
software tools. Hence, the new methods may alter the
old relationships entirely.

E. RESEARCH METHODOLOGY

The research methodology implemented by the authors of 
this thesis was fivefold, to include literature search, data 
search/collection, research design, model validation, and 
data analysis/evaluation. 

A literature search was conducted both by manual and 
automated means. A manual search produced most of the references
used by Green and Selby, which were used to provide
the researchers with a solid background in the area of study 
and to recreate, as closely as possible, the knowledge base 
from which the Green/Selby model was developed. Two automated
searches were conducted, one through the Defense
Logistics Information Studies Exchange (DLSIE) and one via
the computerized library search network. Both searches 
produced numerous writings of interest from the private and 
military sectors.

The search for data highlighted the largest single stumbling
block to research in the area of software maintenance,
that of a lack of adequate data collection by maintaining
activities. Actual manloading records have usually been
kept during the development phases of numerous software
projects; however, maintenance data appears to have been
recorded only recently, and then only sporadically at best.
The search for data was conducted successfully via telephone
conversations with the following persons/organizations:

Goddard Space Flight Center, Greenbelt, Md.; and
Dr. Willa Kay Weiner-Ehrlich, consultant, Bankers Trust
Co., NY, NY.

The following organizations were contacted in the course of
the search with no significant results:

Data and Analysis Center for Software, Griffiss AFB, NY;
United States Army Computer Systems Command, Ft. Belvoir, Va.;
Aeronautical Systems Division, Wright Patterson AFB, Dayton, Ohio; and
Data Systems Design Center, Gunter AFSTA, Montgomery, Ala.

Valuable support and/or referral information was received
from the following persons:

Dr. Robert Grafton, Office of Naval Research, Washington, D.C.;
Dr. Victor Basili, University of Maryland, College Park, Md.;
Mr. David Weiss, Naval Research Laboratory, Washington, D.C.;
Ms. Cheryl Maloney and Mr. Robert Jones, United States
Army Computer Systems Command, Ft. Belvoir, Va.; and
Mr. Lawrence Putnam, Quantitative Software Management,
Inc., McLean, Va.






The NASA SEL data base, which contains data on about 
forty software projects, was received from the Data and 
Analysis Center for Software, but it was discovered that 
maintenance data is just now being collected, and no significant
aggregate will be available for approximately two
years.

A report, produced for the Air Force by General Research
Corporation of Santa Barbara, Ca., indicated that the
Planning and Resource Management Information System (PARMIS)
at the Air Force Data Systems Design Center (AFDSDC), Gunter
AFSTA, Montgomery, Ala., held a large, relatively untapped
data base of manpower usage (projected and actual) from
about 2000 projects. However, the data search revealed that
PARMIS was replaced by a new Personnel Cost/Accounting
System in 1977/1978 and it appears that the former data base
was deleted due to format incompatibilities with the new
system.

As such, it is apparent that little maintenance data is 
available or, if in existence, it is very difficult to 
locate. 

Once a knowledge base was developed and data collected, 
the research process was begun. That process is listed in 
general: 

A. Develop mathematical relationships in terms of equations;

B. Validate Green/Selby model development;

C. Analyze empirical project data in terms of Green/Selby
model; and

D. Interpret data analysis.

In order to attempt to validate the Green/Selby model, 
the model development was recreated as closely as possible 
using the same or similar data.

Data analysis was conducted by using various non-linear
curve fitting techniques to fit actual life cycle manloading
values to the Rayleigh model. Then, Green/Selby
model relationships were calculated and plotted against 
maintenance phase values. The above techniques allowed eval- 
uation of applicability of the Green/Selby model with actual 
project data. 
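
As an illustration of what such a fit involves, the sketch below fits the Rayleigh manning-rate form Y' = 2Kat exp(-at^2) to a series of observed manning levels. It is a minimal example under stated assumptions and is not the fitting procedure used in this research: the function names, the search bounds on a, and the sample data are all hypothetical. The approach exploits the fact that the model is linear in K once a is fixed, so only a has to be searched numerically.

```python
import math

def rayleigh_rate(t, K, a):
    """Manning rate of the Rayleigh model: Y'(t) = 2*K*a*t*exp(-a*t**2)."""
    return 2.0 * K * a * t * math.exp(-a * t * t)

def fit_rayleigh(times, manning, a_lo=1e-4, a_hi=5.0, iters=200):
    """Least-squares estimate of (K, a) from observed manning levels.

    For a fixed shape parameter a the best K has a closed form, so a is
    found by golden-section search on the sum of squared errors.
    Assumes (hypothetically) that the error has one minimum in [a_lo, a_hi].
    """
    def best_K(a):
        basis = [2.0 * a * t * math.exp(-a * t * t) for t in times]
        return sum(y * b for y, b in zip(manning, basis)) / sum(b * b for b in basis)

    def sse(a):
        K = best_K(a)
        return sum((y - rayleigh_rate(t, K, a)) ** 2 for t, y in zip(times, manning))

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = a_lo, a_hi
    for _ in range(iters):
        x1 = hi - phi * (hi - lo)
        x2 = lo + phi * (hi - lo)
        if sse(x1) < sse(x2):
            hi = x2   # minimum lies in [lo, x2]
        else:
            lo = x1   # minimum lies in [x1, hi]
    a = (lo + hi) / 2.0
    return best_K(a), a

# Hypothetical yearly manning data generated from K = 100 man-years, a = 0.125.
years = [float(t) for t in range(1, 11)]
observed = [rayleigh_rate(t, 100.0, 0.125) for t in years]
K_hat, a_hat = fit_rayleigh(years, observed)
print("K =", round(K_hat, 3), "a =", round(a_hat, 5))
```

With real project data the observations would be noisy, and the choice of error measure and search range would need to be justified for the data at hand.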

F. OVERVIEW OF THE THESIS 

In this introductory chapter, the term software 'maintenance'
was defined and its importance in the context of the
data systems organization was discussed. The problem to be 
considered in this thesis has been presented and the objec- 
tives of the research effort intended to resolve the problem 
have been delineated. Assumptions made at the onset of the 
research effort and major limitations encountered during the 
course of the research were discussed. Finally, the research 
methodology was outlined. Chapter II looks at various 
models and cost estimating techniques which were used as a 
basis for the development of the Green/Selby model. It also 
includes a synopsis of other models which the researchers 
feel are of importance to the particular area of study. 
Chapter III presents an in-depth analysis of the Green/Selby 
model, and its proposed applications. Chapter IV provides a 
mathematical and empirical validation of the model, using 
similar data to that used by Green and Selby originally. 
Chapter V discusses the data analysis, and thus, the empirical
model validation evaluation. Finally, Chapter VI summarizes
the thesis and presents conclusions and
recommendations.

II. SOFTWARE MAINTENANCE COST ESTIMATION MODELS



A. CURRENT TECHNIQUES USED AS A BASIS FOR THE GREEN/SELBY
MODEL

1. Putnam's Software Cost Estimating Model

Putnam developed his method for software cost estimation
by studying various systems designed by the United
States Army Computer Systems Command (USACSC) and comparing
them to the Rayleigh life cycle profile developed by Peter
V. Norden in the 1960's. This life cycle profile, depicted
in Figure 2.1, linked the individual cycles of each of the
life cycle phases and added them together, producing the
profile for the entire project. Putnam's empirical studies
showed that, for the systems studied, the software life cycle
exhibits a rise in manpower up to a peak and then a trailing
off portion corresponding very well with Norden's Rayleigh
curve.

Figure 2.1 Rayleigh Project Life Cycle Profile



Putnam attempts to answer the questions "How do I
know how long a software project will take, and how much
will it cost?" [10] In order to do this, Putnam analyzes
the following areas:

• Optimum man-loading over life cycle
• Total manpower over life cycle
• Cost per year
• Life cycle cost in
    • Current $
    • Inflated $
    • Discounted $ (for S. A.)
• Minimum $ benefits to break even over economic life
• Risk profiles for:
    • Manpower
    • Costs
    • Project completion [11]

The Rayleigh model for cumulative manpower utilization,
used by Putnam, is given by the formula

$$ Y = K\left(1 - e^{-at^{2}}\right) \tag{2.1} $$


where 



Y = cumulative manpower used, 

K = the total number of man-years of life cycle 
effort, 

a = the curve shape parameter, and 
t = the elapsed time in years. 

However, the most popular form of the curve is the derivative
form for current manpower utilization, expressed by

$$ Y' = 2Kate^{-at^{2}} \tag{2.2} $$

Empirically derived:

$$ a = \frac{1}{2t_d^{2}} \tag{2.3} $$

where

t_d = the time to reach peak effort.

In terms of software projects, t_d has been empirically shown
to correspond very closely to the design time (or the time
to reach initial operational capability) of a large software
project [12].
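
The relation in equation (2.3) can also be read directly off equation (2.2). As a worked step, added here for illustration, setting the derivative of the manning rate to zero locates the peak:

$$
\frac{dY'}{dt} = 2Ka\,e^{-at^{2}}\left(1 - 2at^{2}\right) = 0
\quad\Longrightarrow\quad
t_d = \frac{1}{\sqrt{2a}}
\quad\Longrightarrow\quad
a = \frac{1}{2t_d^{2}}
$$

On this reading, the empirical finding is the correspondence between t_d and the development time, while the link between a and t_d is a property of the curve itself.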

With t_d representing the development time for the
system, equation (2.3) can be substituted into the Rayleigh
equation, and the shape of the curve, together with the
accompanying equation, allows us to project what the manpower
requirements and cash flow for system development will be at 
any given time. (Cash flow is calculated by multiplying 
manpower projections by the current personnel salaries.) 
The equation representing this curve is [13]

$$ Y' = \frac{K}{t_d^{2}}\, t\, e^{-t^{2}/(2t_d^{2})} \tag{2.4} $$
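
The following sketch evaluates equation (2.4) for a hypothetical project, together with the cumulative form of equation (2.1). The values of K and t_d are invented for illustration, and the helper names are assumptions. The sketch also reports the time at which the manning curve has its inflection point after the peak, which for this curve works out to the square root of three times t_d; this is a property of the Rayleigh form itself and is not a statement of the Green/Selby procedure.

```python
import math

def manning_rate(t, K, t_d):
    """Equation (2.4): Y'(t) = (K / t_d**2) * t * exp(-t**2 / (2 * t_d**2))."""
    return (K / t_d ** 2) * t * math.exp(-t ** 2 / (2.0 * t_d ** 2))

def cumulative_effort(t, K, t_d):
    """Equation (2.1) with a = 1 / (2 * t_d**2) substituted."""
    return K * (1.0 - math.exp(-t ** 2 / (2.0 * t_d ** 2)))

def peak_manning(K, t_d):
    """Manning level at the peak, t = t_d; equals K / (t_d * sqrt(e))."""
    return manning_rate(t_d, K, t_d)

def inflection_time(t_d):
    """Time after the peak where the curve changes from concave to convex."""
    return math.sqrt(3.0) * t_d

# Hypothetical project: 100 man-years of life cycle effort, peak at year 2.
K, t_d = 100.0, 2.0
for year in range(0, 9):
    print(year,
          round(manning_rate(year, K, t_d), 2),       # people on board
          round(cumulative_effort(year, K, t_d), 1))  # man-years expended
print("peak manning:", round(peak_manning(K, t_d), 2))
print("inflection at t =", round(inflection_time(t_d), 3))
```

Multiplying each yearly manning figure by an average personnel cost would give the cash flow projection described above.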



Putnam found that there was a fundamental relation- 
ship in software development between the number of source 
statements in the system and the effort, development time, 
and the state of technology being applied to the project. 
The equation that describes this relationship is:

$$ S_s = C_k K^{1/3} t_d^{4/3} \tag{2.5} $$



where 

Ss = the number of end product source lines of code 
delivered, 

K = the life cycle effort in man-years, 

t_d = development time, and

Ck = a state of the technology constant.

At least three different estimates of program size
should be made before development of the system begins.
They should be made once during the system definition phase
and at least twice during the functional design and specification
phase. This will ensure a very realistic estimate of
the size of the system. Admittedly, estimation of Ss and Ck
is extremely difficult; however, if similar projects have
been done in the past their values should remain fairly
constant [14].
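
Equation (2.5) can be rearranged to give the life cycle effort implied by a size estimate, K = (Ss / (Ck t_d^(4/3)))^3. The sketch below shows the calculation; the numerical values of Ss, Ck, and t_d are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def source_statements(Ck, K, t_d):
    """Equation (2.5): Ss = Ck * K**(1/3) * t_d**(4/3)."""
    return Ck * K ** (1.0 / 3.0) * t_d ** (4.0 / 3.0)

def life_cycle_effort(Ss, Ck, t_d):
    """Equation (2.5) solved for K, in man-years."""
    return (Ss / (Ck * t_d ** (4.0 / 3.0))) ** 3

# Hypothetical inputs: 100,000 source lines, Ck = 5000, 2-year development.
K = life_cycle_effort(100000.0, 5000.0, 2.0)
print("life cycle effort:", round(K, 1), "man-years")

# Because K varies as 1 / t_d**4, compressing the schedule is costly.
K_short = life_cycle_effort(100000.0, 5000.0, 1.8)
print("effort at 1.8 years:", round(K_short, 1), "man-years")
```

The fourth-power dependence on t_d means that small errors in the development time estimate produce large swings in the effort estimate, which is one reason several independent size estimates are advisable.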

Putnam's model seems to work extremely well with 
large scale software projects but it does not seem to fit 
well for projects under 10,000 lines of source code [15]. 
The largest problem with the use of Putnam's model is the 
reliance on past experience and historical data banks, if in 
fact they exist, to estimate the size and complexity of the 
current project. It also pays little attention to operation 
and maintenance costs after development is complete or non- 
manpower related items such as computer time and travel 
allowances which may influence total life cycle costs to a 
great extent. 



2. Parr's Software Cost Estimating Model



The Parr model was developed by F. N. Parr after he 
had studied the work done by Norden and Putnam on the 
Rayleigh curve. Parr was concerned that the Rayleigh curve 
failed to answer questions about the learning curves usually 
associated with the start of new projects. He also felt 
that it made the assumption that the skill available for a
project depends on resources which have been applied to it.
This, he states, confuses the intrinsic constraints of the
linear learning curve with the rate at which software can be
written, based on management's economically governed choices
in response to these constraints. Parr further states that:

The process generally used to develop new software can
be thought of as the successive solution of a large number
of small problems. The solution of each of these individual
problems is a decision which defines some feature
of the final program. A development project corresponds
to starting out with some fixed bounded set of problems to
be solved and ending with enough decisions having been
made for a working product to be available. [16]

Parr utilized a binary tree concept to statistically 
determine the number of possible problems and decided that 
the proportion of the problems solved at time t, denoted as 
W(t), was given by the formula

$$ W(t) = \frac{1}{1 + Ae^{-at}} \tag{2.6} $$

where

A = a constant, and
a = a shape parameter.

By solving this equation, he could determine the 
expected change in the size of the visible unsolved node set 
as a linear function of the work completed. The importance 
of this was that he determined that the rate at which work 
could be usefully input to the development process was 
proportional to the size of the set of visible unsolved 
problems, V(t). He further determined that when the optimal
input effort is applied, steps in the development would be
achieved at a rate proportional to V(t). Thus the work-rate
could be determined by solving for V(t), which he developed
into the equation:

$$ V(t) = \frac{1}{4}\,\operatorname{sech}^{2}\!\left(\frac{at + c_3}{2}\right) \tag{2.7} $$

where

c_3 = an integration constant.

Figure 2.2 shows the resulting curve overlaid on a corresponding
Rayleigh curve.

Figure 2.2 Comparison of Sech-Squared and Rayleigh Curves

It can be seen that the back portion of the sech-squared
function correlates very highly with the Rayleigh
curve. However, the front portion does not show a well-defined
starting point, as is the case with the Rayleigh
curve. Parr feels that the front portion of the curve
represents that portion of the work done before the official
starting date for a project. He feels that this is more
realistic than the Rayleigh curve.
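
The relationship between equations (2.6) and (2.7) can be checked numerically. The sketch below assumes that the integration constant corresponds to c3 = -ln A, which follows from differentiating equation (2.6) but is not stated in the text; under that assumption the slope of W(t) equals a times the sech-squared term. The parameter values are hypothetical. The sketch also shows the point made above about the front of the curve: the sech-squared work rate is already positive at t = 0, whereas a Rayleigh manning rate is zero there.

```python
import math

def parr_fraction_solved(t, A, a):
    """Equation (2.6): proportion of problems solved at time t."""
    return 1.0 / (1.0 + A * math.exp(-a * t))

def sech_squared_rate(t, A, a):
    """Equation (2.7) with the assumed constant c3 = -ln(A)."""
    c3 = -math.log(A)
    return 0.25 / math.cosh((a * t + c3) / 2.0) ** 2

# Hypothetical parameters.
A, a = 9.0, 1.5
h = 1e-6
for t in (0.0, 1.0, 2.0, 3.0):
    # Central-difference slope of W(t), compared with a * V(t).
    slope = (parr_fraction_solved(t + h, A, a)
             - parr_fraction_solved(t - h, A, a)) / (2.0 * h)
    print(t, round(slope, 6), round(a * sech_squared_rate(t, A, a), 6))
```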

Parr went on to explore the complexity factors
introduced by the increased usage of structured programming
and developed the formula:

$$ V(t) = \left[\frac{aAe^{-2at}}{\left(1 + Ae^{-2at}\right)^{3/2}}\right] \Big/ a \tag{2.8} $$

The resulting curve has its peak shifted slightly to
the right of the sech-squared function, which predicts that
peak work rate will occur after half the project has been 
done. This he asserts is in keeping with the theory that 
design may be slower, but there will be a compensating 
reduction in testing and maintenance effort. 
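
The direction of this shift can be checked with a short calculation, offered as an illustrative sketch. It assumes that the bracketed term of equation (2.8) is the work rate and that the fraction of work completed, W(t), is its integral; the closed form for W(t) below is derived here and is not stated in the text.

$$
\begin{aligned}
u &= Ae^{-2at}, \qquad V \propto \frac{u}{(1+u)^{3/2}}, \qquad W(t) = (1+u)^{-1/2} \\
\frac{dV}{du} &\propto \frac{1 - u/2}{(1+u)^{5/2}} = 0 \;\Longrightarrow\; u = 2 \\
W_{\text{peak}} &= (1+2)^{-1/2} = \frac{1}{\sqrt{3}} \approx 0.577
\end{aligned}
$$

Under these assumptions the peak work rate falls when roughly 58 percent of the work has been done, compared with exactly half for the sech-squared form of equation (2.7), where the peak coincides with W = 1/2.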



3. Army Macro-estimating Model

Having already developed a number of software 
systems, the Army decided that it needed a method which 
would be simple, effective, and reasonably accurate for 
determining and controlling manpower and dollar resources 
for any point in the software life cycle. 

After reviewing the data on its existing systems, 
the Army chose the mathematical relationship developed by 
Norden, where

$$ Y' = 2Kate^{-at^{2}} \tag{2.9} $$

This equation was the same one used by Putnam, and it was 
used by the Army to derive the various milestones to be used 
by system managers. By comparing the actual resources used 
when these milestones were reached, the action officer could 
take corrective action if, statistically, those resources 
used were outside the control limits. 

These milestones were developed based on step-by- 
step procedures given in the following cases: 

Case I: System already under development (resources
budgeted).

Using budget data, the maximum level of manpower
(Y'_max) and the number of years to reach maximum effort
(t_Y'max) are determined. Rather than compute the values for
out year manpower loading, Table II is used to obtain the
values of Y' for the appropriate t_Y'max. By multiplying any
entry opposite its time period by K, the appropriate number
of man-years is obtained. The units of K and t will determine
the dimensions.
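
The Table II ordinates appear to be values of equation (2.4) with K set to one, that is, (t / t_peak^2) exp(-t^2 / (2 t_peak^2)), where t_peak stands for t_Y'max. The sketch below reproduces an entry on that assumption and applies the Case I procedure to a hypothetical budget. The function names and the budget figures are illustrative only, and the step that recovers K from the peak manning level is derived from equation (2.4) rather than taken from the Army procedure.

```python
import math

def table_ordinate(t, t_peak):
    """Manning ordinate per unit of K at year t, for peak effort at t_peak."""
    return (t / t_peak ** 2) * math.exp(-t ** 2 / (2.0 * t_peak ** 2))

def effort_from_peak(y_max, t_peak):
    """Total effort K implied by peak manning y_max reached at t_peak."""
    return y_max * t_peak * math.sqrt(math.e)

def outyear_manning(K, t_peak, years):
    """Projected manning for years 1..years (table entry times K)."""
    return [K * table_ordinate(t, t_peak) for t in range(1, years + 1)]

# Hypothetical budget: manning peaks at 40 people in year 3.
K = effort_from_peak(40.0, 3.0)
print("implied K:", round(K, 1), "man-years")
for year, people in enumerate(outyear_manning(K, 3.0, 8), start=1):
    print(year, round(people, 1))
```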

TABLE II

Ordinates for Manpower Functions

 t \ t_Y'max     1       2       3       4       5       6       7
 ------------------------------------------------------------------
   a           .50     .1250   .0556   .0313   .0200   .0139   .0102
   1           .60653  .22062  .10510  .06057  .03920  .02739  .02020
   2           .27067  .30326  .17794  .11031  .07384  .05255  .03918
   3           .03332  .24349  .20217  .14153  .10023  .07354  .05585
   4           .00134  .13533  .18271  .15163  .11618  .08897  .06933
   5           .00001  .05492  .13852  .14307  .12130  .09814  .07906
   6                   .01666  .09022  .12174  .11682  .10108  .08480
   7                   .00382  .05112  .09461  .10508  .09845  .08664
   8                   .00067  .02539  .06766  .08897  .09135  .08497
   9                   .00009  .01110  .04475  .07124  .08116  .08036
  10                   .00000  .00429  .02746  .05413  .06926  .07356
  11                   .00000  .00147  .01567  .03912  .05691  .06530
  12                           .00044  .00833  .02694  .04511  .05634
  13                           .00012  .00413  .01770  .03453  .04729
  14                           .00002  .00191  .01111  .02556  .03866
  15                           .00000  .00082  .00666  .01830  .03081
  16                           .00000  .00033  .00382  .01269  .02395
  17                                   .00012  .00210  .00853  .01817
  18                                   .00004  .00110  .00555  .01346
  19                                   .00001  .00055  .00350  .00974
  20                                   .00000  .00026  .00214  .00689

Case II: New system (no resource data).

Total man-years of effort and peak time for manpower
loading are derived using Bayes' theorem. Based on empirical
data from internal systems, a probability versus K density
function was derived without regard to type of system. Further
analysis determined frequency of system type and
probability of occurrence of each type. Using estimates
based on past USACSC experiences (the average K value for
all systems under development and average K for the functional
type of system), initial estimates for a new development
are calculated from regression graphs. Then, applying
Bayes' theorem to average these individual estimates in the
weighted probability sense yields a better estimate of K
with a smaller standard deviation (i.e., better confidence in
the estimate). To improve estimates and reduce uncertainty,
Bayes' theorem is successively applied. [17]
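
The effect described here, a combined estimate with a smaller standard deviation than any of its inputs, can be illustrated with a simple case. The sketch below assumes that each initial estimate of K is an independent, normally distributed quantity with a known standard deviation, in which case the Bayesian combination reduces to inverse-variance weighting. That distributional assumption, the function name, and the figures are all hypothetical; the text does not specify the form the Army procedure used.

```python
def combine_estimates(estimates):
    """Combine independent normal estimates given as (mean, std_dev) pairs.

    Each estimate is weighted by the inverse of its variance; the combined
    standard deviation is never larger than the smallest input.
    """
    weights = [1.0 / (sd * sd) for _, sd in estimates]
    total = sum(weights)
    mean = sum(w * m for w, (m, _) in zip(weights, estimates)) / total
    return mean, (1.0 / total) ** 0.5

# Hypothetical estimates of K in man-years:
# one from all systems, one from the functional type of system.
overall = (200.0, 60.0)
by_type = (260.0, 80.0)
mean, sd = combine_estimates([overall, by_type])
print("combined K:", round(mean, 1), "std dev:", round(sd, 1))

# Successive application: fold in a further (hypothetical) estimate.
mean2, sd2 = combine_estimates([(mean, sd), (230.0, 50.0)])
print("updated K:", round(mean2, 1), "std dev:", round(sd2, 1))
```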

4. Lehman-Belady Model

L. A. Belady and M. M. Lehman developed their model
by studying the management and evolution of the OS/360 operating
system. They felt that this system gave them a good
view of the processes and managerial thinking that go into
the development and programming of medium to large-sized
projects. The decision to use this system was reached after
they had surveyed a number of versions and releases of 
OS/360 before their study began. The data for each release 
included measures of the size of the system, the number of 
modules added, or changed, the release date, information on 
manpower used, machine time used and costs involved in each 
release. In general, there were large, apparently 
stochastic, variations in the individual data items from 
release to release. 

The data exhibited a general upward trend in the 
size, complexity, and cost of the system and the maintenance 
process. This was indicated by comparing the components, 
statements, instructions, and modules handled over the 
system life cycle. The various parameters were averaged to 
expose trends. When averaged, previously erratic data 
appeared to become strikingly smooth, displaying nonlinear,
possibly exponential, growth and complexity.

As a result of their research, they postulated three
laws of Program Evolution Dynamics.

I. Law of continuing change. A system that is used 
undergoes continuing change until it is judged more cost 
effective to freeze and recreate it. 

Software does not face the physical decay problems
that hardware faces. But the power and logical flexibility
of computing systems, the extending technology of
computer applications, the ever-evolving hardware, and the
pressures for the exploitation of new business opportunities
all make demands. Manufacturers, therefore,
encourage the continuous adaptation of programs to keep in
step with increasing skill, insight, ambition, and opportunity.
In addition to such external pressures for
change, there is the constant need to repair system
faults, whether they are errors that stem from faulty
implementation or defects that relate to weaknesses in
design or behavior. Thus, a programming system undergoes
continuous maintenance and development, driven by mutually
stimulating changes in system capability and environmental
usage. In fact, the evolution pattern of a large program
is similar to that of any other complex system in that it
stems from the closed-loop cyclic adaptation of environment
to system changes and vice versa.

As a system is changed, its structure inevitably
degenerates. The resulting system complexity and reduction
of manageability are expressed by the Second Law of
Program Evolution Dynamics.

II. Law of increasing entropy. The entropy of a
system (its unstructuredness) increases with time, unless
specific work is executed to maintain or reduce it.

This law too expresses vast experience, in part by
data. ... This, in turn, leads to the formulation of the
Third Law of Program Evolution Dynamics.

III. Law of statistically smooth growth. Growth 
trend measures of global system attributes may appear to 
be stochastic locally in time and space, but, statisti- 
cally, they are cyclically self-regulating, with well-de- 
fined long-range trends. 

The system and the metasystem - the project organization
that is developing it - constitute an organism that is
constrained by conservation laws. These laws may be
locally violated, but they direct, constrain, control, and
thereby regulate and smooth, the long-term growth and
development patterns and rates. Observation, measurement,
and interpretation of the latter can thus be used to plan,
control, and forecast better the product of an existing
process and to improve the process so as to obtain desired
or desirable characteristics. [18]



Having postulated these three laws, they commenced
the process of defining a complexity factor $C_R$ for the
various program releases, each of which was assigned a
Release Sequence Number (RSN). From the available data
they proposed the formula:

\[ C_R = MH_R / M_R, \tag{2.10} \]

where

$M_R$ measures the size of the system in
modules and

$MH_R$ records the number of system modules
that have received attention.
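
As an illustration of equation (2.10), the following Python sketch computes the complexity factor for a few releases. The function name and the release figures are hypothetical and are not taken from the OS/360 data.

```python
def complexity_factor(modules_handled: int, total_modules: int) -> float:
    """Fraction of the system's modules handled in a release (MH_R / M_R)."""
    if total_modules <= 0:
        raise ValueError("total_modules must be positive")
    return modules_handled / total_modules


# Hypothetical release history: (RSN, modules in system, modules handled).
releases = [(1, 1000, 150), (2, 1200, 300), (3, 1500, 600)]

for rsn, m_r, mh_r in releases:
    print(f"RSN {rsn}: C = {complexity_factor(mh_r, m_r):.3f}")
```

A rising value from release to release would correspond to the growth in complexity that the authors describe.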

Utilizing this complexity factor, they stated that
the design - programming - distribution - usage system has a
feedback driven and controlled transfer function and an
input-output relationship. This feedback results, sometimes,
from constant pressure to supplement system capability
and power. This constant pressure normally results
in work pressures building up as the growth rate increases;
accordingly, the growth rate increases the size and
complexity of the system and reduces the quality of design,
coding, and testing. This is accompanied by lagging
documentation, and other factors, which emerge to counter the
increasing growth rate.

Eventually, the above relationship resulted in the 
need for a system consolidation in which correction, 
restructuring, and rewriting were done with few, if any, 
functional enhancements. The consolidation often results in 
the shrinking of a system during such a release, rather than 
the growing normally experienced with each new release. 
This, they observed, occurred with every twenty to twenty- 
one releases of the system. They further observed that 
successful releases appeared to have an upper bound of about 
400 modules. 

Since the majority of managers base their decisions
on available budgets, Lehman and Belady proposed that the
total expenditure for all activities involved with the
project be equal to the budget, and hence, the formula for
the budget (B) is given by:

\[ B = P + A + C, \tag{2.11} \]

where

P is units of fault extraction activity
termed progressive,

A is the amount of resources associated with
documentation, administration, communication,
and learning activity termed antiregressive, and

C is the increasing work demanded to cope with
the neglect of A, and is given by the formula

\[ C = \int (1 - m)\,kP\,dt, \tag{2.12} \]

where m and k are defined below.

The formula for antiregressive activities is:

\[ A = mkP, \tag{2.13} \]

where

m is the management factor, which is the
fraction of progress, kP, that is actually
dedicated by management to A activity, and

k represents the inherent A activity required
for each unit of P activity so that complexity
does not grow and is given by the formula

\[ k = A / P. \tag{2.14} \]
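
The budget relationships (2.11) through (2.13) can be sketched numerically. The Python fragment below is only an illustration: it assumes that P is given per period, approximates the integral in (2.12) by a running sum, and uses hypothetical values for P, k, and m.

```python
def budget_by_period(progressive, k, m, dt=1.0):
    """Return one (P, A, C, B) tuple per period.

    A = m*k*P is the antiregressive effort actually applied,
    C accumulates the neglected share (1 - m)*k*P over time
    (a rectangle-rule stand-in for the integral), and B = P + A + C.
    """
    rows = []
    c = 0.0
    for p in progressive:
        a = m * k * p
        c += (1.0 - m) * k * p * dt
        rows.append((p, a, c, p + a + c))
    return rows


# Hypothetical project: 10 units of progressive work per period, k = 0.5,
# and management funding only half of the required antiregressive work.
for period, row in enumerate(budget_by_period([10, 10, 10], k=0.5, m=0.5), 1):
    print(period, row)
```

With m = 1 the C term stays at zero, which matches the definition of k as the level of A activity at which complexity does not grow.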

Management is assumed to have full control of the
allocation of its resources and the division of effort
between P- and A-type activities. Management cannot,
however, directly control the growth in complexity that
accumulates, except by utter concentration on complexity
control through restructuring. This is an activity that is
strictly antiregressive and, as such, is psychologically
difficult to inspire, since it yields no direct, short-term,
benefits. [19]



An interpretation of their model suggests that more
rapid work leads to greater pressures on the team, and hence
more errors. This, in turn, requires greater repair
activity. However, the data indicate that this repair work is
mainly incurred in the same release rather than discovered
and undertaken thereafter. Furthermore, since it appears to
lead to an increase in the fraction of the system handled,
it suggests that the maintenance teams tend to remove the
symptoms of a fault rather than to locate and repair its
cause. This problem is reduced through proper communication,
documentation, and learning by the programming
team. [20]



B. OTHER MODELS OF INTEREST

1. Jensen Model

Randall W. Jensen [21] stated that, because tradi- 
tional intuitive estimation methods consistently produce 
optimistic results which contribute to the too familiar cost 
overrun and schedule slippage, customers for software prod- 
ucts are becoming less willing to tolerate the losses asso- 
ciated with inaccurate estimates. He, therefore, derived 
his model based primarily on the work done by Norden, 
Putnam, and Doty Associates. 

In conjunction with the familiar Rayleigh equation

\[ Y' = 2Kat\,e^{-at^{2}}, \tag{2.15} \]



Jensen's model consists of a series of equations for system 
productivity, initial project staffing rate, system 
complexity, system size, development effort, and risk 
analysis. 

He defines the productivity relationship by the
equation:

\[ PR = C_n \left( K / t_d^{2} \right)^{-\beta}, \tag{2.16} \]

where

PR = average project productivity (source
lines per year),

K = total life cycle cost in man-years,

$t_d$ = development time in years, defined
as the peak time for the Rayleigh curve,

$C_n$ = a proportionality constant, and

$\beta$ = slope of the productivity relationship.

While this equation is not actually related to the
system difficulty, it is related to the rate at which staff
is applied to the task. Intuitively, productivity is an
inverse function of the number of people directly involved
with a development task due to the associated losses caused
by the number of communication paths in the organization.
This phenomenon can be accounted for by utilizing the
relationship

\[ M = K / t_d^{2}, \tag{2.17} \]

which is the formula for the initial project staffing rate,
M, and is extremely important in determining the optimum
project staffing rate.

Most, if not all, of the projects studied by Jensen
appeared to demonstrate a consistent pattern which could be
used to classify each project into distinct categories.
These categories were dependent on the interface complexity,
logical complexity, and the percentage of new development in
the system, all of which seemed to be defined by the ratio

\[ K / t_d^{3}. \tag{2.18} \]

The expression $K/t_d^{3}$, in a practical sense, represents
a natural equilibrium between the lifecycle cost and
development time for a specific class of software projects. As a
result, similar projects tended to maintain this equilibrium
so that as the system size increased, the development
schedule increased correspondingly. This equilibrium also
maintained the staffing rate,

\[ K / t_d^{2}, \tag{2.19} \]

within bounds that could be effectively accommodated by the
project. Thus, he used this equilibrium expression to
define system complexity (D) as

\[ D = K / t_d^{3}. \tag{2.20} \]




The value of D can be thought of as a limiting
parameter in determining the minimum development time that 
an organization can achieve for a given software project. 
Table III shows the values of D determined by Jensen from 
Putnam's analysis of USACSC data. 
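
The staffing rate and complexity ratios can be evaluated directly. The Python sketch below assumes the forms M = K/t_d^2 and D = K/t_d^3 used in this section; the function names and the sample project are hypothetical.

```python
def staffing_rate(K: float, t_d: float) -> float:
    """Initial project staffing rate, M = K / t_d**2."""
    return K / t_d**2


def complexity(K: float, t_d: float) -> float:
    """System complexity, D = K / t_d**3."""
    return K / t_d**3


def min_development_time(K: float, D: float) -> float:
    """Development time implied by a life cycle effort K and a class value D."""
    return (K / D) ** (1.0 / 3.0)


# Hypothetical project: 120 man-years of life cycle effort over 2 years.
K, t_d = 120.0, 2.0
print(staffing_rate(K, t_d), complexity(K, t_d))
print(min_development_time(K, D=15.0))
```

Read this way, a fixed D from Table III acts as a floor on the schedule: for a given K, the development time cannot be compressed below the value returned by the last function without leaving the project's complexity class.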

The next equation, developed by Jensen, was referred 
to as the software equation, relating the size of the system 
to the technology being applied by the developer in the 
implementation of the system. In deriving this equation, 
Jensen utilized an extension of the productivity relation- 
ship proposed by W. P. Sampson of General Electric Company. 

Sampson [22], after reviewing data supplied by 
Putnam from 19 USACSC projects, determined that only a 
subset of these projects represented a consistent develop- 
ment environment and were sufficiently documented to be of 
value in establishing the model parameters. Evaluation of 
this refined set of data obtained a $\beta$ value of -0.50 for the
basic relationship between productivity and project stress 
instead of the -0.667 obtained when all the data was used. 

With Sampson's work in mind, Jensen derived the
software equation to establish the rate of source code
development, $dS_s/dt$. In his development, he assumed that
the portion of the project effort devoted to code production,
$P_1(t)$, was characterized by a Rayleigh curve, which
was complete at $t_d$.
TABLE III

Project Complexity Values

Value   Characteristics

  8     Applies to new systems with significant interface
        and interaction requirements within a larger
        system structure. Operating system and real time
        processing developments with large percentages
        of logical code are typical of this class of
        systems.

 15     Applies to new standalone systems developed on
        firm operating systems. The interface problem
        with the underlying operating system or other
        parts of the system is minimal. New applications
        software is typical of this class of systems.

 27     Applies to complete rebuilds of existing
        standalone systems where major portions of
        existing logic can be used.

 55     Applies to composite systems where existing
        systems are combined or integrated with little
        or no modification of existing software.



Then if

\[ t_d / t_{d1} = \sqrt{6}, \tag{2.21} \]

where

$t_{d1}$ = the time of peak manloading on the Rayleigh
curve, coincidental to development time, and

\[ \int_0^{t_d} P_1(t)\,dt = \int_0^{t_d} (K / t_d^{2})\, t\, e^{-(3t^{2}/t_d^{2})}\,dt = 0.95K/6, \tag{2.22} \]

then the burdening rate for this project is

\[ \frac{\int_0^{t_d} P(t)\,dt}{\int_0^{t_d} P_1(t)\,dt} = \frac{0.3934K}{0.95K/6} = 2.49, \tag{2.23} \]

where

P(t) = staffing level. The rate of source code development,
$dS_s/dt$, is assumed to be proportional to the rate of
code production, $P_1(t)$, so that

\[ dS_s/dt = 2.49\,PR\,P_1(t), \]

and

\[ S_s = 2.49\,PR\,(K / t_d^{2}) \int_0^{\infty} t\, e^{-(3t^{2}/t_d^{2})}\,dt = 2.49\,PR\,K/6. \tag{2.24} \]

Substituting the empirically derived value of
$PR = C_1 M^{-0.5}$ gives:

\[ S_s = (2.49\,C_1/6)\,K^{0.5}\, t_d, \]

or

\[ S_s = C_t \sqrt{K}\, t_d, \tag{2.25} \]

which is the software equation where

$C_t$ = a developer technology constant.



This technology constant, $C_t$, is a factor, or
constant of proportionality, that allows the user to relate
the system size, $S_s$, the life cycle effort, K, and the
development time, $t_d$, for any specified project. The
constant accounts for all variations in the life cycle
effort for projects which have similar size and schedule
properties. The constant is then a measure of the developer's
production technology, or ability to implement the
project. This includes such factors as the availability of
computing resources, organizational strategies, development
tools and methodologies, familiarity with the target
computer, etc.
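
The constants in the derivation of the software equation can be checked numerically. The Python sketch below assumes the closed forms of the two Rayleigh integrals over 0 to t_d, namely K(1 - e^{-1/2}) for total staffing and (K/6)(1 - e^{-3}) for code production, and the software equation in the form S_s = C_t K^{1/2} t_d. The function names and the sample inputs are hypothetical.

```python
import math


def burden_ratio() -> float:
    """Ratio of total effort to code production effort up to t_d (K cancels)."""
    total_effort = 1.0 - math.exp(-0.5)         # about 0.3935
    code_effort = (1.0 - math.exp(-3.0)) / 6.0  # about 0.95 / 6
    return total_effort / code_effort


def system_size(c_t: float, K: float, t_d: float) -> float:
    """Software equation: S_s = C_t * sqrt(K) * t_d."""
    return c_t * math.sqrt(K) * t_d


def life_cycle_effort(s_s: float, c_t: float, t_d: float) -> float:
    """Software equation solved for K."""
    return (s_s / (c_t * t_d)) ** 2


print(round(burden_ratio(), 2))
# Hypothetical project: C_t = 4000, K = 100 man-years, t_d = 2 years.
print(system_size(4000.0, 100.0, 2.0))
```

Carried to more digits, the ratio comes out near 2.48, which agrees with the 2.49 quoted above to within the rounding of the intermediate constants.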
The technology constant considers two aspects of
production, the environmental aspect and the technical
aspect. The environmental aspect includes those factors
dealing with the basic computing environment. The
environmental factors determine a technology constant which
normally ranges between 2000 and 5000, with higher values
characteristic of higher productivity environments, i.e.,
from primitive tools to dedicated advanced tools and
resources. The technical aspects of the technology constant
are accounted for through the use of adjustment factors
applied to the basic technology constant by use of the
formula

\[ C_t = C_{tb} \Big/ \prod_{i=1}^{14} f_i = C_{tb} / f_t, \tag{2.26} \]

where

$C_{tb}$ = basic technology constant,

$f_i$ = ith adjustment factor, and

$f_t$ = total adjustment factor.



The adjustment factors include those effects which are 
beyond the basic development environment and are project 
specific. The factors, which are shown in Table IV, are 
examples of those found in a command and control system 
environment. 
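
The adjustment of the technology constant can be sketched in Python as follows. The factor values are transcribed from Table IV; the sketch assumes that the total adjustment factor is the product of the fourteen individual factors, and the function names, the basic constant of 4000, and the sample answers are hypothetical.

```python
import math

# (value if Yes, value if No) for each factor number in Table IV.
ADJUSTMENT_FACTORS = {
    1: (1.11, 1.00), 2: (1.00, 1.54), 3: (1.05, 1.00), 4: (1.33, 1.00),
    5: (1.25, 1.00), 6: (1.51, 1.00), 7: (1.92, 1.00), 8: (1.67, 1.00),
    9: (1.43, 1.00), 10: (1.39, 1.00), 11: (2.22, 1.00), 12: (1.25, 1.00),
    13: (1.80, 1.00), 14: (1.40, 1.00),
}


def total_adjustment(answers: dict) -> float:
    """Multiply the Yes/No value of every factor (assumed product form)."""
    missing = set(ADJUSTMENT_FACTORS) - set(answers)
    if missing:
        raise ValueError(f"no answer for factors {sorted(missing)}")
    return math.prod(
        yes if answers[n] else no
        for n, (yes, no) in ADJUSTMENT_FACTORS.items()
    )


def technology_constant(c_tb: float, answers: dict) -> float:
    """Adjusted constant: the basic constant divided by the total adjustment."""
    return c_tb / total_adjustment(answers)


# Hypothetical project: detailed operational requirements exist (factor 2),
# real time operation (factor 4), computer at another facility (factor 9).
answers = {n: False for n in ADJUSTMENT_FACTORS}
answers.update({2: True, 4: True, 9: True})
print(round(technology_constant(4000.0, answers)))
```

Every answer is required explicitly because factor 2 is the one entry in Table IV whose penalty applies to a "No" rather than a "Yes".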



Feeling that his model could be understood better as
a linear programming problem presented in a graphical
format, Jensen defined additional formulas which he
could use in this format. The first formula was for the
development effort (E), which he derived as:

\[ E = 0.4K. \tag{2.27} \]
TABLE IV

Technology Constant Adjustment Factors

Factor                                                Value
Number  Description                                 Yes     No

  1     Special display requirements                1.11    1.00
  2     Detail operational requirements             1.00    1.54
  3     Changes to operational requirements         1.05    1.00
  4     Real time operation                         1.33    1.00
  5     CPU memory constraint                       1.25    1.00
  6     CPU time constraint                         1.51    1.00
  7     First software developed on CPU             1.92    1.00
  8     Concurrent ADP hardware development         1.67    1.00
  9     Developer using computer at another
        facility                                    1.43    1.00
 10     Development at operational site             1.39    1.00
 11     Development computer different than
        target computer                             2.22    1.00
 12     Development at multiple sites               1.25    1.00
 13     First use of language                       1.80    1.00
 14     MIL-STD documentation                       1.40    1.00



The next was a relationship (R) determined by the system
size and the developer's approach to the project and was
given by:

\[ R = S_s / C_t = \sqrt{K}\, t_d. \tag{2.28} \]

Then, utilizing the formulas for M and D, equations (2.17)
and (2.20), where M represents a fixed staffing rate or
management stress curve, and D represents projects of fixed
complexity, he could plot all these equations on a solution
surface for various size projects as shown in Figure (2.3).

Figure 2.3 Macro-Estimating Model Solution Surface
(Graphical Representation)
With respect to either effort or time, the optimum
solution will be located at one of the vertices defined by
the constraint lines. The possibility exists that, once all
the constraints, D, R, M, E, and $t_d$, are plotted on the
solution surface as shown in Figure (2.4), some of the
constraints will be eliminated from further analysis by the
manner in which other constraints intersect to form the
bounded region. If the constraints bound a null region,
either the cost or schedule is too optimistic and cost or
schedule overruns in software development are likely to
occur. However, by utilizing the values for K and $t_d$
obtained from the graph and substituting into the Rayleigh
equation, the optimum staffing profile (Y') can be obtained.

Figure 2.4 Feasible Solution Region

Recognizing that the calculations made by the model
assume that the input parameters are exactly known, and that
there is a degree of uncertainty associated with each of the
input parameters, Jensen postulated, for risk analysis, that
the deviation from the mean can be calculated using the
relationship

\[ \sigma_f = \left[ (\partial f/\partial S_s)^{2}\sigma_s^{2} + (\partial f/\partial C_t)^{2}\sigma_c^{2} + (\partial f/\partial D)^{2}\sigma_D^{2} \right]^{0.5}, \tag{2.29} \]

where

\[ f = t_d = \sqrt[5]{(S_s/C_t)^{2} / D}, \]

or

\[ f = K = \left[ (S_s/C_t)^{3} D \right]^{0.4}. \]

Similar expressions for f could be found by using M,
instead of D, as the bounds for the feasible region. In
cases where both M and D interact, the expression for f
should be considered invalid and no alternative solution was
provided. [24]
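
The risk relationship can be sketched for the schedule case, f = t_d. The Python fragment below assumes t_d = [(S_s/C_t)^2 / D]^{1/5}, which follows from combining the software equation with D = K/t_d^3, and differentiates it analytically. The function names are illustrative, and the technology constant and its deviation in the usage lines are hypothetical.

```python
import math


def dev_time(s_s: float, c_t: float, d: float) -> float:
    """t_d = ((S_s / C_t)**2 / D) ** (1/5)."""
    return ((s_s / c_t) ** 2 / d) ** 0.2


def schedule_sigma(s_s, c_t, d, sigma_s, sigma_c, sigma_d) -> float:
    """Standard deviation of t_d by first-order error propagation."""
    t = dev_time(s_s, c_t, d)
    # Partial derivatives of t_d = (S_s/C_t)**0.4 * D**-0.2.
    dt_ds = 0.4 * t / s_s
    dt_dc = -0.4 * t / c_t
    dt_dd = -0.2 * t / d
    return math.sqrt(
        (dt_ds * sigma_s) ** 2 + (dt_dc * sigma_c) ** 2 + (dt_dd * sigma_d) ** 2
    )


# Size and complexity inputs loosely patterned on Jensen's example;
# C_t = 4000 and sigma_c = 200 are assumptions made for this sketch.
t_d = dev_time(55642.0, 4000.0, 15.0)
sigma = schedule_sigma(55642.0, 4000.0, 15.0, 2058.0, 200.0, 1.0)
print(t_d, sigma)
```

Under a normal assumption, the probability of meeting a required schedule could then be read from the standard normal distribution using the required time, the computed t_d, and sigma.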

As an example of this risk analysis technique he
provided the case where $S_s$ = 55,642; D = 15; $\sigma_s$ = 2,058;
$\sigma_D$ = 1; and t = 0.482. The results were then plotted as
shown in Figure (2.5). The results show that the
probability of meeting the required schedule is 94
percent. [25]

2. Other Models

Descriptions of some additional models, which were
not used in this thesis but which the reader might find
informative, are provided in Appendix A and Appendix B, as
described by R. Thibodeau and R. Wolverton, respectively [26,27].



Figure 2.5 Risk Analysis of Schedule Using Graphical
Technique

C. CHAPTER TWO SUMMARY

The purpose of the models discussed in this chapter, and of
others that were found in the literature, was to give
management a tool with which it could predict the
cost of software, the time for producing this software, or
both. Most, if not all, of the models require the use of
historical data and/or management's previous experience as a
portion of the predictive process.

It was Putnam's view that software production followed a
Rayleigh curve. This curve, he asserted, could be calculated
utilizing historical data to determine the technology
constant ($C_k$), the estimate of source lines of code for
this type of project ($S_s$), and the budgeting information
for the total number of man-years for the system's life
cycle.

The Army Macro model utilized Putnam's technique, but,
at various time increments, would compare actual results
with those predicted and, if the actual resources expended
were statistically outside some preset control limits,
corrective action would be taken.
Parr felt that Putnam's model did not take into account
the effort that was completed prior to the actual starting 
date. He, therefore, proposed a model which would take this 
work into account in the early part of the project. It also 
correlated well with the work done by Norden and Putnam with 
the Rayleigh curve, both at the peak level and in the later 
stages. 

Lehman and Belady found in their study of the evolution 
of the OS/360 operating system programming effort that, as 
the size and complexity of each release which contained 
functional enhancements increased, so did the number of 
errors and, thus, the amount of maintenance effort also 
increased. Therefore, they postulated that for any system
there is a time when it is better to restructure and
consolidate than to continue with additional enhancements.

Jensen felt that Putnam's model required some expansion
and refinement. This he attempted to accomplish through the 
use of linear programming and graphical representation of 
his results. 






III. MAINTENANCE COST ESTIMATING VIA THE GREEN/SELBY MODEL

The Green/Selby model includes two techniques: the first 
characterized by a macro approach and the second by a micro 
approach. The results of the application of both techniques 
to project planning parameters are compared and then weighed 
against managerial and organizational constraints to analyze 
tradeoffs and produce cost estimates. 



A. MACRO APPROACH



The macro approach is concerned with man-loading across
the life cycle of the project and, in particular, the
maintenance phase. The basis for this approach is derived from
the relationships pioneered by Norden and further developed
by Putnam. As was stated in chapter two, the various phases
of the software project life cycle have been found, in
general, to be characterized by the Rayleigh curve function.
The function is written as follows:

\[ Y' = 2Kat\,e^{-at^{2}}, \tag{3.1} \]

where

Y' = manloading at any time t, normally measured in
manyears or manmonths,

t = elapsed time from the start of the project,

K = the total accumulative manpower utilized over the
project life cycle, measured in manyears or
manmonths,
and

a = the shape parameter of the curve.
Norden demonstrated that the shape parameter (coefficient),
a, can be calculated by the equation:

\[ a = 1 / (2t_d^{2}), \tag{3.2} \]

where

$t_d$ = the point in time of maximum manpower utilization
for the project.

It must be noted here that $t_d$ in equation (3.2) can, in
large projects (defined by Putnam as those projects with
about 75,000 source lines of code [28]), be equated to
project development time. In other words, large projects
have historically been characterized by maximum manloading
at the end of the development phase, roughly when the
product was delivered to the user. However, it has been
found empirically [29] that for other than large projects
(less than 75,000 source lines of code) $t_d$ actually falls at
some point between $t_0$ and the end of the development phase.
This may or may not affect the Green/Selby model. The end



of the development phase will be denoted as $t_{dev}$, if it in
fact does not coincide with $t_d$. Putnam has indicated that
for small projects (less than 18,000 source lines of code)
$Y'_{max}$ is reached at about $t_{dev}/\sqrt{6}$. Medium sized projects
(18,000 - 75,000 source lines of code) reach $Y'_{max}$ somewhere
between $t_{dev}/\sqrt{6}$ and $t_{dev}/2$. [30] Therefore, $t_d$, in this
thesis, will be defined as the time at which Y' reaches a
maximum.

Substituting equation (3.2) into equation (3.1) gives
the following equation:

\[ Y' = (K / t_d^{2})\, t\, e^{-t^{2}/2t_d^{2}}. \tag{3.3} \]



This equation can be used to calculate Y' at any point on
the curve once K and $t_d$ are known. The calculation or
estimation of K and $t_d$ has been sufficiently dealt with in the
literature and so will not be addressed here [31].
However, it must be noted that t = $t_d$ at the point of
maximum manloading, and so, at that point, equation (3.3)
reduces to:

\[ Y'_{max} = (K / t_d)\, e^{-1/2}. \tag{3.4} \]
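
Equations (3.3) and (3.4) translate directly into code. The Python sketch below is illustrative only; the function names and the sample values of K and t_d are hypothetical.

```python
import math


def manloading(t: float, K: float, t_d: float) -> float:
    """Rayleigh manloading, Y' = (K / t_d**2) * t * exp(-t**2 / (2 * t_d**2))."""
    return (K / t_d**2) * t * math.exp(-(t**2) / (2.0 * t_d**2))


def peak_manloading(K: float, t_d: float) -> float:
    """Manloading at t = t_d, Y'max = (K / t_d) * exp(-1/2)."""
    return (K / t_d) * math.exp(-0.5)


# Hypothetical project: K = 100 man-years, peak at t_d = 2 years.
K, t_d = 100.0, 2.0
for t in (0.5, 1.0, 2.0, 3.0, 4.0):
    print(t, round(manloading(t, K, t_d), 2))
print("peak:", round(peak_manloading(K, t_d), 2))
```

Evaluating the curve on either side of t_d confirms that the value given by equation (3.4) is the maximum of equation (3.3).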

Norden also stated that the Rayleigh curve exhibited an
inflection point where the decrease in manpower usage slows
down in the descending portion of the curve [32], as
characterized by the equation:

\[ t_{ip} = (3/2a)^{1/2}, \tag{3.5} \]

where

$t_{ip}$ = the time of the inflection point of the Rayleigh
curve, and

a = the curve shape parameter.
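
The location of the inflection point can be checked numerically. The Python sketch below evaluates equation (3.5) and estimates the second derivative of the Rayleigh curve by central differences; the helper names, the step size, and the sample parameters are assumptions made for the illustration.

```python
import math


def rayleigh(t: float, K: float, a: float) -> float:
    """Y' = 2*K*a*t*exp(-a*t**2)."""
    return 2.0 * K * a * t * math.exp(-a * t * t)


def inflection_time(a: float) -> float:
    """t_ip = sqrt(3 / (2*a))."""
    return math.sqrt(3.0 / (2.0 * a))


def second_derivative(f, t: float, h: float = 1e-4) -> float:
    """Central-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2


# Hypothetical curve: K = 100 and t_d = 2, so a = 1 / (2 * t_d**2) = 0.125.
K, a = 100.0, 0.125
t_ip = inflection_time(a)
curve = lambda t: rayleigh(t, K, a)
print(t_ip, second_derivative(curve, t_ip))
```

The second derivative is negative just before the computed time and positive just after it, which is the change of curvature that the text describes on the descending portion of the curve.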

The Green/Selby model is based on the theory that $Y'_{t_{ip}}$
can be defined as a maximum level of maintenance effort for
a project. The minimum level of maintenance effort is
defined by $Y'_{t_{im}}$, the inflection point on the curve for
the maintenance phase, which, for large projects in general,
has been said in the literature to follow the Rayleigh
pattern. The definition of $t_{ip}$ as a maximum level of
maintenance was further supported by the hypothesis that the
maximum level of manloading during the maintenance phase,
$Y'_{t_m}$, was equal to the manloading at the inflection point,
$Y'_{t_{ip}}$. This hypothesis appears to be based on the assumption
that the maximum point of the maintenance phase coincides
both in time and in magnitude with the inflection point of
the life cycle curve. Green and Selby used the empirical
data synthesized from a spectrum of USACSC projects to
develop the theory. Figure (3.1) depicts their theoretical
model.




Figure 3.1 Normalized Rayleigh Curve

B. MICRO APPROACH

The micro approach was developed by Green and Selby
using raw manning data obtained from the IBM Federal Systems
Space Shuttle Program and the unpublished papers of Mr. Kyle
Rone of IBM. This approach uses a matrix technique coupled
with work breakdown structures to project maintenance
manning requirements. The raw data was synthesized by Green
and Selby to fit the macro model and then compared with the
results of the micro matrix method. The authors of this
thesis were not able to obtain data of sufficient complexity
and refinement to apply micro techniques to it, and,
therefore, the micro approach will not be discussed further in
this work.

C. PROJECTED MODEL APPLICATIONS

The Green/Selby model was presented as a management
tool. The control concept coupled with the planning concept
appeared to be a total maintenance strategy package for the
project manager. The model could provide management with the
determination of a maintenance support level by use of the
inflection point predictors ($Y'_{t_{ip}}$ and $Y'_{t_{im}}$) to define
maximum and minimum maintenance manpower utilization
boundaries. These boundaries, coupled with a planning strategy,
provide a powerful planning tool.

Use of the model was also projected for forecasting of
resource distribution via integration techniques applied to
the area of the curve under the maintenance support boundary
to break out manpower required by separation of development
work (enhancements, additions, new design) from pure
maintenance work (debugging, design error correction). [33]

The model was finally projected as a device for
monitoring configuration control. Drawing on the work of Lehman
and Belady, Green and Selby theorized that, as a project
moves from pure "fix-it" type maintenance to modifications
which may eventually lead to a new release of the product,
the complexity of the product increases. This rise in
complexity increases the maintenance level. As successive
releases are developed, the maintenance level increases
until it eventually exceeds the original maximum maintenance
support level of the product. This would then prompt a
management assessment of the viability of the project from a
cost effectiveness point of view, as the project will have
reached what Green and Selby called a maintenance budget
saturation point. At this point, or earlier, depending on
management policies and desires, the old project would be
terminated and a new life cycle/Rayleigh curve started.

D. CHAPTER THREE SUMMARY 

The Green/Selby model appears to provide an easy-to-use 
cost estimation tool for the data systems manager. The macro
and micro approaches give fairly quick estimates of mainte- 
nance manloading which can be cross compared and coupled 
with management constraints to fill out the system manager’s 
overall strategy. If valid, it seems to partially fill the 
void in data systems management, alluded to in the GAO 
report, that of the lack of a maintenance strategy in an 
organization where maintenance is considered a discrete 
function. 
IV. MODEL VALIDATION



A mathematical development of the Green/Selby model was
completed by the authors of this thesis solely by algebraic
substitution and reduction, working with the basic equations
and relationships from the works of Norden and Putnam. An
empirical development of the model was completed using the
same or similar data to that used by Green and Selby. Both
developments follow.

A. MATHEMATICAL DEVELOPMENT 

The Norden/Rayleigh curve equation, as discussed
earlier, is:

\[ Y' = 2Kat\,e^{-at^{2}}. \tag{4.1} \]



This equation is characterized as a two parameter equation,
as the outcome hinges on two parameters, K and a, calculated
across the life cycle for all/any times from $t_0$ to $t_n$.
The parameter, a, as used in the Green/Selby model, is
calculated by:

\[ a = 1 / (2t_d^{2}). \tag{4.2} \]



The Green/Selby Model appears to have been developed for
large projects with the assumption that $t_{dev}$ and $t_d$ do
coincide. Therefore, if a is substituted into the
Norden/Rayleigh equation, the commonly used form is found:

\[ Y' = 2K\,(1/2t_d^{2})\, t\, e^{-(1/2t_d^{2})\,t^{2}}, \]

which reduces to

\[ Y' = (K / t_d^{2})\, t\, e^{-t^{2}/2t_d^{2}}. \tag{4.3} \]



Norden noticed that the inflection point on the project
life cycle curve is characterized by:

\[ t_{ip} = (3/2a)^{1/2}. \tag{4.4} \]

If the equation for a is substituted in, equation (4.4)
reduces to:

\[ t_{ip} = \left( 3 / (2/2t_d^{2}) \right)^{1/2} = \sqrt{3}\, t_d. \tag{4.5} \]



Substituting this equation into equation (4.3) gives:

\[ Y'_{t_{ip}} = 2K\,(1/2t_d^{2})\, t_{ip}\, e^{-((1/2t_d^{2})(t_{ip})^{2})}, \]

which reduces to

\[ Y'_{t_{ip}} = 2K\,(1/2t_d^{2})\,(3t_d^{2})^{1/2}\, e^{-((1/2t_d^{2})(3t_d^{2}))}, \]

which further reduces to

\[ Y'_{t_{ip}} = (1.73K / t_d)\, e^{-3/2}. \tag{4.6} \]



In the Green/Selby Model, it is theorized that the
inflection point of the life cycle curve and the point of
$Y'_{max}$ on the curve for the maintenance phase coincide. The
times $t_{ip}$ and $t_m$ are the same absolute time; however, for
purposes of calculations, they differ, since $t_m$, the maximum
manning for the maintenance curve, is calculated relative to
the start time for maintenance or the $t_0$ for the maintenance
curve. If development time is equal to $t_d$, as was assumed in
the Green/Selby Model, and if the maintenance effort starts
at $t_d$, then the $t_0$ for the maintenance curve is $t_d$ for the
life cycle curve. Figure (4.1), with a corresponding time
line, demonstrates the general relationship.

Green and Selby symbolized the elapsed time $t_0$ to $t_m$ as
$t_e$:

\[ t_e = t_m - t_0. \tag{4.7} \]



It is at this juncture that difficulty in the
development arises. The difficulty lies in the definition of where
the maintenance phase begins. Does it begin at $t_{dev}$ when the
development phase ends as in Figure (4.1), or does it begin
sometime after that? The time to $Y'_{max}$ and thus, the shape
parameter, a, depend on that definition. Green and Selby,
using Army data, stated that, on the average, the
maintenance phase began at time 1.3 with $t_d$ normalized to 1, or
time ($t_d + 0.3t_d$). Therefore, the estimate of $t_d$ for
maintenance curve projection, or $t_e$, will be as shown in Figure
(4.2) and equation (4.8) below.
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Figure 4, 1 Maintenance Phase Tiaing Relationships 



The estimate of K for the maintenance phase also came
from the Army data, which indicated that, on the average, the
K for the maintenance phase is 20 percent of life cycle K, or
0.2K (life cycle), with life cycle K normalized to 1.
Figure 4.2 Maintenance Phase Timing Relationships in
the Green/Selby Model



Since it is theorized that $t_m = t_{ip}$, it can be seen from
Figure (4.2) that

$$t_e = t_{ip} - (t_d + 0.3t_d). \qquad (4.8)$$

It must be noted here that this development, because of
the nature of the problem and the lack of firm data, cannot
be a pure mathematical development; however, the attempt is
made to approximate it as closely as possible. Even though
the $t_d$, or time of $Y'_{max}$, in the equations for the life
cycle and maintenance curves denotes the same type of
relationship within its parent equation, the quantities
are necessarily different. As far as the authors know, and
it is projected that the case was the same for Green and
Selby, no specific relationship between $t_d$(lc) and $t_d$(m)
has been found empirically. Therefore, for this development
to exhibit credibility, known estimation factors from the
Army data must be introduced. This also tends to indicate
that until some firm relationship between $t_d$'s is found,
general applicability will be lacking. The same applies for
the K factor.

After substituting the value for $t_{ip}$ from equation
(4.5), equation (4.8) becomes:

$$t_e = (3t_d^2)^{1/2} - (t_d + 0.3t_d) = 0.43t_d. \qquad (4.9)$$

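Substituting the value for $t_e$ (maintenance phase $t_d$) into
equation (3.4) for the $Y'_{max}$ of a curve gives:

$$Y'_{t_m} = \frac{K}{t_e}\, e^{-1/2},$$

which, with the maintenance phase K taken as 0.2K and
$t_e = 0.43t_d$, reduces to

$$Y'_{t_m} = \frac{0.2K}{0.43t_d}\, e^{-1/2}. \qquad (4.10)$$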



The constant $e^{-3/2}$, in equation (4.6), is calculated to be
0.223, and the constant $e^{-1/2}$ above is calculated to be
0.607. They are substituted into equations (4.6) and (4.10)
respectively to give:

$$Y'_{t_{ip}} = \frac{1.73K}{t_d}(0.223) \quad\text{or}\quad Y'_{t_{ip}} = \frac{0.386K}{t_d} \qquad (4.11)$$

and

$$Y'_{t_m} = \frac{0.2K}{0.43t_d}(0.607) \quad\text{or}\quad Y'_{t_m} = \frac{0.121K}{0.43t_d}. \qquad (4.12)$$

Attempting to equate $Y'_{t_{ip}}$ to $Y'_{t_m}$ produces:

$$\frac{0.386K}{t_d} = \frac{0.121K}{0.43t_d}. \qquad (4.13)$$



Algebraic reduction carries the development to completion:

$$\frac{0.43t_d}{t_d} = \frac{0.121K}{0.386K}$$

and

$$0.43 = 0.121/0.386, \qquad (4.14)$$

which gives

$$0.43 \neq 0.313.$$

A similar development using K's and $t_d$'s alone, without the
relational factors taken from Army project experience, gives
similar results. This is significant since it indicates
that, for large projects where life cycle $t_d = t_{dev}$, the
manloading at the maximum point on the maintenance curve is
not necessarily equal to the manloading at the inflection
point on the life cycle curve. There are situations where,
theoretically, with the right values for $t_d$, $t_e$, and the two
K's, $Y'_{t_{ip}}$ and $Y'_{t_m}$ will be equal, but it becomes apparent
that no such general rule can be demonstrated. Therefore,
the proof of applicability, as has been the case in all
areas of software cost estimating research so far, falls
back into the arena of empirical development. The empirical
development used by Green/Selby follows in section B.
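
The comparison in equations (4.11) through (4.14) can be checked numerically. The following is a minimal sketch, not part of the original development: the function names and parameters are hypothetical, and it assumes the Rayleigh form of equation (4.1) together with the Army-derived factors quoted above (maintenance K equal to 0.2K, maintenance starting at $1.3t_d$). Because $K/t_d$ appears on both sides of equation (4.13), it cancels, so times are expressed in units of $t_d$.

```python
import math

def peak_to_inflection_ratio(k_fraction, start_factor):
    """Ratio of peak maintenance manloading to life cycle manloading at
    the inflection point, assuming the maintenance peak falls at t_ip.

    k_fraction   -- maintenance K as a fraction of life cycle K (assumed input)
    start_factor -- maintenance start time in units of t_d (assumed input)
    """
    t_e = math.sqrt(3.0) - start_factor        # elapsed time t_0 to t_m, in units of t_d
    if t_e <= 0.0:
        raise ValueError("maintenance must start before the inflection point")
    maintenance_peak = (k_fraction / t_e) * math.exp(-0.5)   # form of equation (4.10)
    inflection_load = math.sqrt(3.0) * math.exp(-1.5)        # form of equation (4.6)
    return maintenance_peak / inflection_load

def k_fraction_for_equality(start_factor):
    """Maintenance K fraction that would make the two magnitudes equal."""
    t_e = math.sqrt(3.0) - start_factor
    return math.sqrt(3.0) * math.exp(-1.0) * t_e

army_ratio = peak_to_inflection_ratio(0.2, 1.3)
needed_fraction = k_fraction_for_equality(1.3)
print(f"peak/inflection with Army factors: {army_ratio:.3f}")
print(f"K fraction needed for equality:    {needed_fraction:.3f}")
```

Under these assumptions the ratio is roughly 0.73, in line with the inequality 0.43 versus 0.313 found above, and equality would require a maintenance K of roughly 27.5 percent of life cycle K rather than 20 percent.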



B. EMPIRICAL DEVELOPMENT 

The present authors, in recreation of the Green/Selby
model, developed it as follows.

All parameters were normalized to values of $t_d$ and K
equal to 1. With $t_d = 1$ and equation (4.2), calculate a:

$$a = \frac{1}{2t_d^2} = 0.5. \qquad (4.15)$$



Substitute a into equation (4.4) and calculate $t_{ip}$:

$$t_{ip} = (3/2a)^{1/2} = 1.73 \text{ years}. \qquad (4.16)$$



Substitute $t_{ip}$ into equation (4.6) to calculate $Y'_{t_{ip}}$:

$$Y'_{t_{ip}} = 2Kat_{ip}\, e^{-a(t_{ip})^2},$$

$$Y'_{1.73} = 2(1)(0.5)(1.73)\, e^{-0.5(1.73)^2},$$

and

$$Y'_{1.73} = 0.387 \text{ manyears}. \qquad (4.17)$$



To equate maximum maintenance manloading to the life cycle
inflection point, define the time of maximum maintenance
as $t_m$. Thus,

$$Y'_{t_{ip}} = Y'_{t_m}. \qquad (4.18)$$
U.S. Army Computer Systems Command project data indicated
that, across the spectrum of Army software projects, the
maintenance phase included about 20 percent of the life
cycle. Therefore, K for the maintenance phase is 0.2K with
respect to the normalized life cycle K value of 1. Here, it
must be assumed that Army data analysis is valid. However,
it is the contention of the authors of this thesis that an
average of all Army large scale software projects will give
a good figure for $K/t_d$ for their types of projects. Army
data also indicated that the maintenance phase started at
1.3 years normalized time ($t_0$). If $Y'_{t_{ip}} = Y'_{t_m}$ at $t_{ip}$,
then, making the same assumption as Green/Selby, that
$t_{ip} = t_m$, the time of maximum maintenance manloading, $t_e$,
can be calculated by:

$$t_e = t_m - t_0,$$

and

$$t_e = 1.73 - 1.3 = 0.43 \text{ years}. \qquad (4.19)$$

Calculate $a_m$ for the maintenance curve from equation (4.2):

$$a_m = \frac{1}{2t_e^2} = 2.71. \qquad (4.20)$$



Substitute $a_m$ and $t_e$ into equation (4.1) to calculate $Y'_{t_m}$:

$$Y'_{t_m} = 2Ka_mt_e\, e^{-(a_m(t_e)^2)},$$

$$Y'_{t_m} = 2(0.2K)(2.71)(0.43)\, e^{-(2.71(0.43)^2)},$$

and

$$Y'_{t_m} = 0.2824. \qquad (4.21)$$
Use equation (4.4) to calculate $t_{im}$:

$$t_{im} = (3/2a_m)^{1/2} = 0.74 \text{ years}. \qquad (4.22)$$



The maintenance curve inflection point, $t_{im}$, on a life
cycle basis, normalizes to 2.04 years. Substitute $t_{im}$ into
equation (4.6) to calculate $Y'_{t_{im}}$:

$$Y'_{t_{im}} = 2Ka_mt_{im}\, e^{-(a_m(t_{im})^2)},$$

$$Y'_{t_{im}} = 2(0.2K)(2.71)(0.74)\, e^{-(2.71)(0.74)^2},$$

and

$$Y'_{t_{im}} = 0.182. \qquad (4.23)$$



The normalized curve as developed above is depicted in
Figure (4.3).

Here, $Y'_{t_{ip}}$ is clearly not equal to $Y'_{t_m}$, as was also
found in the mathematical development, but rather, $Y'_{t_m}$ is
about 25 percent less than $Y'_{t_{ip}}$ in magnitude, when $t_m$ and
$t_{ip}$ coincide.
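
The steps of equations (4.15) through (4.23) can be retraced in a few lines. This is an illustrative sketch only; the variable and function names are hypothetical, and the 0.2 and 1.3 factors are the Army-derived values quoted in the text. Intermediate values are carried unrounded, so the results differ slightly from the figures above, which were computed from values rounded to two places (for example, $t_e$ = 0.43).

```python
import math

def rayleigh_rate(t, K, a):
    """Rayleigh manloading Y'(t) = 2*K*a*t*exp(-a*t^2), equation (4.1)."""
    return 2.0 * K * a * t * math.exp(-a * t * t)

# Life cycle curve, normalized to t_d = 1 and K = 1
K_lc, t_d = 1.0, 1.0
a_lc = 1.0 / (2.0 * t_d ** 2)               # step (4.15)
t_ip = math.sqrt(3.0 / (2.0 * a_lc))        # step (4.16)
y_tip = rayleigh_rate(t_ip, K_lc, a_lc)     # step (4.17)

# Maintenance curve, using the Army-derived factors quoted in the text
K_m = 0.2 * K_lc                            # maintenance K is 20 percent of K
t_0 = 1.3                                   # normalized maintenance start time
t_e = t_ip - t_0                            # step (4.19), assuming t_m = t_ip
a_m = 1.0 / (2.0 * t_e ** 2)                # step (4.20)
y_tm = rayleigh_rate(t_e, K_m, a_m)         # step (4.21)
t_im = math.sqrt(3.0 / (2.0 * a_m))         # step (4.22)
y_tim = rayleigh_rate(t_im, K_m, a_m)       # step (4.23)

print(f"t_ip = {t_ip:.3f}   Y'(t_ip) = {y_tip:.3f}")
print(f"t_e  = {t_e:.3f}   a_m = {a_m:.3f}   Y'(t_m) = {y_tm:.3f}")
print(f"t_im = {t_im:.3f}   Y'(t_im) = {y_tim:.3f}")
print(f"Y'(t_m) / Y'(t_ip) = {y_tm / y_tip:.3f}")
```

The final ratio comes out near 0.73, consistent with the observation that the maintenance peak lies roughly a quarter below the life cycle inflection magnitude.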



C. CHAPTER FOUR SUMMARY

In both the mathematical development and the empirical 
development, maximum manloading for the maintenance phase 
and manloading at the inflection point of the life cycle 
curve were not found to be equal. However, the maintenance 
maximum was below the magnitude at the inflection point. 
Figure 4.3 Developed Normalized Curve



Therefore, though the Green/Selby theory, in itself, may not
be substantiated, some relationships may exist which can be
used for maintenance manpower estimates. The key
relationships in any maintenance manloading estimates appear
to be those of life cycle K versus maintenance K and life
cycle $t_d$ versus maintenance $t_d$. If some empirical
relationship (such as, for all large projects, maintenance
$t_d$ is X percent of life cycle $t_d$, or maintenance K is X
percent of life cycle K) can be determined, then a model
development could possibly be completed which produces
fairly accurate manloading estimates. Such a model would not
necessarily hinge on $Y'_{t_{ip}} = Y'_{t_m}$, but rather on some
relationship such as that exhibited by overall Army project
data, where $Y'_{t_m}$, or maximum average maintenance level,
fell at about 75 percent of $Y'_{t_{ip}}$. The difficulties
encountered in attempting to develop the theory
mathematically, in respect to differences in K's and $t_d$'s,
suggest that there may be other factors affecting the
relationships and the parameters that determine those
relationships. Such factors are discussed in Chapter VI.
V. RESEARCH ANALYSIS



A. DATA DEFINITION 

The data utilized in the research effort was received
from two sources, NASA Goddard Space Flight Center,
Greenbelt, Md., and Dr. Willa Ehrlich, Bankers Trust Co.,
New York, NY. Both sets of data consisted of manloading for
software projects over the life cycle and included
maintenance data. Manpower utilization figures were in
manhours for the NASA data and manmonths/mth for the Bankers
Trust data. The NASA data was converted to manmonths/mth
prior to analysis. The projects analyzed will be called NASA
project and Projects A-D for the purposes of this thesis.

1. Bankers Trust Co. Data

Projects A-D were all medium sized projects, devel- 
oped at Bankers Trust Company. The few project character- 
istics that were known can be found in Table V. A listing of 
project data by manmonths/mth is found in Appendix C. 

2. NASA Data

NASA project data were related to an operational 
system and, though it is an ongoing project and the complete 
life cycle is not yet known, much information could be 
synthesized from the life cycle and maintenance data to 
date. Pertinent project characteristics are listed in Table 
VI. It is readily apparent that the project started as a 
small project, but that it has migrated via maintenance to 
what could be called a large project. However, based on 
project size at the end of development, it must be classi- 
fied as a small sized project. A listing of project data by 
manmonths/mth is found in Appendix C. 
TABLE V

Bankers Trust Co. Projects Characteristics

Project Name   Size     Development   Maintenance   Ending
A              Medium   8/78          1/80          12/80
B              Medium   8/79          6/80          4/81
C              Medium   12/76         4/78          12/78
D              Medium   3/77          11/77         12/79



B. ANALYSIS PROCESS 

The analysis process fell into two categories, curve
fitting and comparison. Actual life cycle manmonth figures
for individual projects were fitted against the Rayleigh
equation via the facilities provided for non-linear curve
fitting in the Statistical Analysis System (SAS) package
available on the resident IBM 3033AP Computer System. The
Marquardt method was chosen as the regression technique. In
addition, data from the four Bankers Trust Co. projects were
combined by normalizing $t_d$ (the time to reach $Y'_{max}$) to 1
for each project, and then the curve fitting techniques were
applied to the normalized/combined data. Manpower figures
for the maintenance phases of individual projects and the
combined data were also fitted to the Rayleigh equation and
then, in each situation, actual data points and fitted
curves for life cycle and maintenance phases were replotted
on a common axis to provide an aggregate picture of the
phase relationships.
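
To make the curve fitting step concrete, the sketch below fits the Rayleigh equation to a manloading series with a small damped Gauss-Newton iteration in the spirit of the Marquardt method. It is not the SAS procedure used for this thesis; the function names, starting guesses, and stopping rule are assumptions chosen for illustration, and the data are synthetic and noise-free.

```python
import math

def rayleigh_rate(t, K, a):
    """Rayleigh manloading Y'(t) = 2*K*a*t*exp(-a*t^2)."""
    return 2.0 * K * a * t * math.exp(-a * t * t)

def fit_rayleigh(times, loads, max_iter=100):
    """Least-squares estimate of (K, a) by a damped Gauss-Newton
    (Marquardt-style) iteration. Assumes positive, unit-spaced times."""
    # Starting guesses: K from the area under the data, a from the peak time.
    K = sum(loads)
    peak_index = max(range(len(loads)), key=lambda i: loads[i])
    a = 1.0 / (2.0 * times[peak_index] ** 2)
    damping = 1e-3

    def sum_sq(K_, a_):
        return sum((y - rayleigh_rate(t, K_, a_)) ** 2
                   for t, y in zip(times, loads))

    best = sum_sq(K, a)
    for _ in range(max_iter):
        # Accumulate the 2x2 normal equations from the analytic derivatives.
        jkk = jka = jaa = gk = ga = 0.0
        for t, y in zip(times, loads):
            decay = math.exp(-a * t * t)
            d_K = 2.0 * a * t * decay
            d_a = 2.0 * K * t * decay * (1.0 - a * t * t)
            resid = y - 2.0 * K * a * t * decay
            jkk += d_K * d_K
            jka += d_K * d_a
            jaa += d_a * d_a
            gk += d_K * resid
            ga += d_a * resid
        m11 = jkk * (1.0 + damping)
        m22 = jaa * (1.0 + damping)
        det = m11 * m22 - jka * jka
        if det <= 0.0:
            break
        step_K = (m22 * gk - jka * ga) / det
        step_a = (m11 * ga - jka * gk) / det
        if abs(step_K) <= 1e-12 * abs(K) and abs(step_a) <= 1e-12 * abs(a):
            break
        trial_K, trial_a = K + step_K, a + step_a
        trial = sum_sq(trial_K, trial_a) if trial_a > 0.0 else float("inf")
        if trial < best:          # accept the step and relax the damping
            K, a, best = trial_K, trial_a, trial
            damping = max(damping / 10.0, 1e-9)
        else:                     # reject the step and damp more heavily
            damping = min(damping * 10.0, 1e9)
    return K, a

# Synthetic check data generated from known parameters (not project data)
times = list(range(1, 21))
loads = [rayleigh_rate(t, 100.0, 0.02) for t in times]
K_fit, a_fit = fit_rayleigh(times, loads)
print(f"K = {K_fit:.3f}, a = {a_fit:.5f}, t_d = {math.sqrt(1.0 / (2.0 * a_fit)):.3f}")
```

With real project data the residuals would not vanish, and the quality of fit would be judged by a correlation measure such as the one reported in Table VII.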

The USACSC data was also reanalyzed. Though it did not
provide substantiation for the specific theory of Green and
Selby, as noted in Chapter IV, it does provide valuable
insight into the phase relationships as applied to large
sized projects. A mass of raw data was not available, but by
taking the aggregate figures provided, critical points along
the Rayleigh curve were calculated.

TABLE VI

NASA Project Data Characteristics

PROJECT HISTORY
   A. Design start date             March 1, 1975
   B. Maintenance start date        July 30, 1977
   C. Date of last data             January 25, 1982

CODE HISTORY
   A. Lines of Code
      1. Original lines of code         4,000
      2. Modified lines of code         8,141
      3. New lines of code             61,230
      4. Total lines of code           73,371
   B. Modules
      1. Original modules                  35
      2. Modified modules                  75
      3. New modules                      450
      4. Total modules                    560
   C. Documentation
      Pages                             3,300

After the curve fitting was completed, the parameters K,
a, and $t_d$ for the life cycle curves and the corresponding
maintenance curves were compared to examine possible common
relationships. Curve magnitudes at $t_{ip}$ for the life cycle
and $Y'_{max}$ ($t_d$) for the maintenance curve were also compared
in terms of the general relationships proposed by Green and
Selby.

C. ANALYSIS RESULTS

An excellent fit was obtained for the life cycle curves
for all five individual projects in relation to the Rayleigh
model. From Table VII, correlation coefficients ranged from
$r^2$ = 0.776, for the NASA project, to $r^2$ = 0.966, for Project
A. The curve fit for the combined Bankers Trust projects
obtained an $r^2$ = 0.869. However, maintenance curves, in
general, did not fit the Rayleigh model well, with
correlation coefficients ranging from $r^2$ = 0.118 for NASA
data to $r^2$ = 0.762 for Project B. Projects B and D
maintenance curves best fit the Rayleigh model, with $r^2$ =
0.762 and 0.747 respectively. These findings indicate that
the maintenance efforts are somewhat erratic, as alluded to
in the GAO study, and, therefore, do not fit a specific curve
well. When maintenance is not managed as a discrete function,
manloading peaks and drops in an inconsistent manner. This
normally results as managers respond, on a crisis basis, to
provide maintenance activity only when trouble arises.

In the NASA data, however, though the overall maintenance
data does not fit the Rayleigh curve well, visual
inspection of the curve reveals what appear to be a series
of small Rayleigh-like curves, the combination of which
exhibits an overall rise of maintenance manloading across the
available data, as can be seen in Figure (5.1). This trend
fits well with the project characteristics, which show that
the size of the project has grown from 4,000 SLOC to about
73,000 SLOC during its life cycle to date. It stands to
reason that the "mini-development cycles" for those
modifications/enhancements which created the increase in
system size would, themselves, exhibit a Rayleigh pattern,
but the aggregate maintenance phase would not necessarily
follow the same pattern. The aggregate curves are included
in Appendix C.

Figure 5.1 NASA Data Fitted Curves

Comparison of parameters gave varying results, as can be
seen in Table VII. Ratios of maintenance K's to life cycle
K's ranged from 0.148 to 1.24, and ratios of maintenance
$t_d$'s to life cycle $t_d$'s ranged from 0.625 to 2.82. This
seems to indicate that no general relationship can be
derived which relates K's and $t_d$'s for the maintenance phase
versus the life cycle with respect to individual projects.
However, as more data is accumulated and research efforts
continue, those relationships might be found to exist for
various aggregate projects.
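
The spread of these ratios can be reproduced directly from the fitted parameters. The sketch below is illustrative only: it uses four of the individual project rows as read from Table VII, forms each ratio as maintenance over life cycle (following the td(M)/td(LC) and K(M)/K(LC) columns), and uses hypothetical names throughout.

```python
# (life cycle t_d, life cycle K, maintenance t_d, maintenance K), from Table VII
projects = {
    "NASA Project": (11, 28.410, 31, 35.234),
    "Project A": (8, 183.374, 5, 27.165),
    "Project B": (6, 137.276, 5, 47.204),
    "Project D": (5, 81.383, 9, 56.699),
}

def phase_ratios(td_lc, k_lc, td_m, k_m):
    """Return (maintenance t_d / life cycle t_d, maintenance K / life cycle K)."""
    return td_m / td_lc, k_m / k_lc

ratios = {name: phase_ratios(*row) for name, row in projects.items()}
for name, (td_ratio, k_ratio) in ratios.items():
    print(f"{name:13s} td ratio = {td_ratio:.3f}   K ratio = {k_ratio:.3f}")

td_ratios = [pair[0] for pair in ratios.values()]
k_ratios = [pair[1] for pair in ratios.values()]
print(f"td ratio range: {min(td_ratios):.3f} to {max(td_ratios):.3f}")
print(f"K ratio range:  {min(k_ratios):.3f} to {max(k_ratios):.3f}")
```

A spread this wide across only a handful of projects is what motivates the conclusion that no single ratio applies to individual projects.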

When $Y'_{t_{ip}}$ of the individual fitted life cycle curves
was compared to $Y'_{t_m}$ of the individual fitted maintenance
curves, similar results to those obtained for K and $t_d$
comparisons were observed. The ratios covered a wide
spectrum. However, when the comparison was made for the
combined Bankers Trust projects curves, the results were
strikingly similar to those of the NASA project and the
USACSC data. USACSC data indicated, as shown in Chapter IV,
that, on the average, $Y'_{t_m}$ = 75 percent of $Y'_{t_{ip}}$. Comparison
of actual maximum manloading for the combined Bankers Trust
project data to the inflection point on the fitted life
cycle curve gave $Y'_{t_m}$ = 69.6 percent of $Y'_{t_{ip}}$. Though only
one project, instead of an aggregate, the NASA data also
showed a general behavior of $Y'_{t_m}$ = 69 percent of $Y'_{t_{ip}}$.
For the NASA project, this interpretation may be
questionable, since some data points lay above the 69
percent of $Y'_{t_{ip}}$ level. In fact, one point lay above $Y'_{t_{ip}}$.
TABLE VII

Compilation of Analysis Results

Life Cycle Parameters

NAME            a         t_d    K          Y'_max   Y'_tip   t_ip
NASA Project    .003969   11     28.410     1.54     0.9982   19.44
Project A       .007143   8      183.374    13.27    8.8586   14.49
Project B       .014294   6      137.276    14.08    8.8422   10.25
Project C       .007605   8      136.913    13.98    9.0296   14.04
Project D       .024288   5      81.383     10.77    6.2905   7.86
Comb'n A-D      .598560   1      19.435     12.89    3.2190   1.58
(norm. td=1)

Maintenance Phase Parameters

NAME            a         t_d    K          Y'_max
NASA Project    .000525   31     35.234     0.693
Project A       .022420   5      27.165     3.477
Project B       .019000   5      47.204     5.579
Project C       .006000   7      53.127     4.000
Project D       .005900   9      56.699     3.740
Comb'n A-D      .311000   1.26   8.480      4.080
(norm. td=1)

Miscellaneous Parameters

                td(M)/    K(M)/    Y'tm/    Maint.   Life Cycle
NAME            td(LC)    K(LC)    Y'tip    Corr.    Corr.
NASA Project    2.820     1.24     .694     .118     .776
Project A       0.625     .148     .392     .511     .966
Project B       0.833     .343     .631     .762     .872
Project C       0.875     .284     .443     .482     .939
Project D       1.800     .696     .595     .747     .893
Comb'n A-D      1.260     .436     .496     .388     .869
(norm. td=1)
However, if one accepts the theory that the NASA project is
characterized during the maintenance phase by a series of
"mini-development phases", then the points above the 69
percent level can be interpreted as manning levels intrinsic
to the development effort and not characteristic of a
general maintenance program. Then the aggregated maximum
maintenance level lies at 69 percent of $Y'_{t_{ip}}$.

D. CHAPTER FIVE SUMMARY

The data were analyzed using non-linear curve fitting
techniques to provide life cycle versus maintenance phase
relationship comparisons. The results seem to exhibit
independence of behavior with respect to values of K and $t_d$.
However, a general trend, within the limited scope of data
available, was found which appears to point to a possible
relationship between maintenance manloading levels and the
magnitude of the inflection point on the life cycle curve.
VI. CONCLUSIONS AND RECOMMENDATIONS

A. INTRODUCTION 

The history of the software industry has been marked by 
cost overruns, late deliveries, poor reliability and
maintenance, and user dissatisfaction. While these problems
are not unique to computing, the record seems to indicate that
software developers as a group are less successful in
meeting quality, cost, and schedule objectives than their
hardware counterparts [34]. With this in mind, a number of
models have been developed, as discussed in Chapter II, to 
provide management the necessary tools to more accurately 
predict the actual costs and time frames for their software 
projects. This thesis attempted to expand the work done by 
Green and Selby on Putnam's model, with special emphasis on 
the maintenance phase of the software life cycle. This 
included a detailed comparison of the peak manloading for 
the maintenance phase with the inflection point on the total 
life cycle Rayleigh curve. 

B. CONCLUSIONS 

The software project manpower macro-estimating model, as
presented by Green and Selby, is not a usable model for the
project manager. As was demonstrated in Chapter IV, and
again in the data analysis in Chapter V, the maximum point
on the maintenance curve is not necessarily equal to the
magnitude at the inflection point of the life cycle curve,
though, theoretically, it is possible for the two points to
be equal. It was also found that the absolute point in time
of the maximum maintenance manloading and the inflection
point may coincide, but, usually, will not. However, these
findings do not invalidate the basic ideas from which the
Green/Selby model was developed. Those basic ideas were
that a relationship may exist whereby maintenance manpower
could be projected by comparison of the maintenance phase
and life cycle Rayleigh curves, or derivations thereof. It
was shown that, within the scope of the limited available
data, only two of the five projects analyzed were
characterized by maintenance phases which closely fit the
Rayleigh model. However, it was demonstrated that, for
combined project data, within project type, and within a
specific organization, a relationship does appear to exist
between the maximum maintenance manpower utilization level
and the inflection point of the life cycle curve, whether the
maintenance phase fits the Rayleigh model or not.

In both the USACSC and combined Bankers Trust Co. data
analyses, and with interpretive license in the NASA data
analysis, maximum maintenance levels were within 65 percent
to 75 percent of the level of $Y'_{t_{ip}}$. There is not enough
evidence here to show that there exists a general rule that
maximum maintenance will be about 70 percent of the
magnitude at the life cycle curve inflection point, but the
implications for project managers within individual
organizations are encouraging. The results of the data
analysis appear to indicate that, for project type, within an
individual organization, analysis of historical data and
comparison of maintenance levels to life cycle curve
inflection points will provide a general baseline maximum
maintenance support level which the manager can use in
outyear maintenance manning projections for future projects.
For example, if historical data for accounting type projects
in organization X shows that maximum maintenance manning is
65 percent of the magnitude at the life cycle curve
inflection point, then the manager can apply that percentage
to the projected life cycle curve calculations for future
projects to obtain a maintenance support projection at the
inception of the project. As the life cycle curve is refined
during the development phase, the maintenance level
projections can be successively refined. This would provide
the ADP manager with a valuable tool in an environment
presently characterized by a general lack of planning and
management direction in the area of software maintenance.
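
The organization X example can be sketched as a short calculation. Everything below is hypothetical: the historical pairs, the projected K and $t_d$, and the function names are invented for illustration, and the inflection magnitude assumes the Rayleigh form used in Chapter IV.

```python
import math

def inflection_manloading(K, t_d):
    """Life cycle manloading at the inflection point of a Rayleigh curve."""
    a = 1.0 / (2.0 * t_d ** 2)
    t_ip = math.sqrt(3.0 / (2.0 * a))
    return 2.0 * K * a * t_ip * math.exp(-a * t_ip ** 2)

def historical_fraction(history):
    """Average of (peak maintenance manloading / inflection manloading)
    over past projects of one type within one organization."""
    return sum(peak / inflection for peak, inflection in history) / len(history)

def projected_maintenance_peak(K, t_d, fraction):
    """Baseline maximum maintenance support level for a planned project."""
    return fraction * inflection_manloading(K, t_d)

# Hypothetical history: (peak maintenance, life cycle inflection), manmonths/mth
history = [(6.5, 10.0), (3.2, 5.0), (13.2, 20.0)]
fraction = historical_fraction(history)

# Hypothetical projection for a new project: K = 200 manmonths, t_d = 10 months
baseline = projected_maintenance_peak(200.0, 10.0, fraction)
print(f"historical fraction = {fraction:.2f}")
print(f"baseline maintenance peak = {baseline:.2f} manmonths/mth")
```

As the projected K and $t_d$ are revised during development, rerunning the last step refines the maintenance projection in the manner described above.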

The results of the data analysis further indicate, by
their lack of strong correlation, that there are other
factors which may have a strong effect on the level of
maintenance required for any software system. This finding is
not entirely surprising, as the authors of this thesis,
after extensive readings in the literature, did not have
much confidence in the possibility of discovering a single,
general, simple decision rule for software maintenance
manning. Rather, the research completed here is only a tiny
bite taken from the mountain of research which needs to be
done. The possible set of constraints and combinations
thereof which affect the software process is astounding. A
few were highlighted by this research effort. It was found
that there was no firm relationship between K's and $t_d$'s of
the corresponding life cycle and maintenance phase curves.
It can be hypothesized that differences in K's (total life
cycle manning) are attributed to such factors as project
size, complexity, and project type. It follows that larger
projects will require higher overall manning levels than
smaller sized projects. The relationships of maintenance $t_d$
versus life cycle $t_d$ are affected, in large part, by
complexity and size of the project. Differing system
complexities may place heavier burdens on different phases
of the development processes, and, thus, cause $t_d$ (time of
maximum manning) to occur at different times for different
projects. There may be, and the authors of this thesis feel
that there will be, no definable relationship between the
point of maximum manning for the maintenance phase and the
corresponding $t_d$ for the life cycle. Since only two of the
five projects analyzed actually fit the Rayleigh model for
the maintenance phase, it would appear that for some
projects, a definable $t_d$ would be forever elusive. Only in
those projects where some type of "mini-development" effort
is completed in the process of providing enhancements or
major modifications will a good fit to the Rayleigh model be
realized, accompanied by a definable maintenance $t_d$ versus
life cycle $t_d$ relationship for that project.

A constraint of even greater importance is the use of
varying software development techniques and methodologies. 
It has been speculated that the majority of research to date 
has been conducted with data collected from projects which 
were characterized by design and coding efforts which did 
not include structured, modular-design techniques, informa- 
tion-hiding modules, and other software development concepts 
and tools. These projects have shown a very close relation- 
ship with the Rayleigh model. A tremendous impact on the 
entire arena may be seen with the increased use of the above 
listed design techniques. How these techniques will affect 
the software equation and, in particular, software mainte- 
nance, is yet to be seen. 

The rise in maintenance activity for the NASA project,
as new developments apparently added modules and source 
lines of code to the system, seems to support the results 
obtained by Lehman and Belady, as described in Chapter II, 
that, as enhancements are added to the original project, the 
maintenance level required to support the project also 
rises. This could be attributed to the fact that the 
addition of enhancements adds complexity to the system
which, in turn, causes a resultant increase in the mainte- 
nance level required. As was discussed earlier, and as is 
seen in the NASA project data, if enhancements continue, the 
maintenance manning rises above the magnitude of the inflec- 
tion point on the life cycle curve. This could also indi- 
cate that the point in time at which the project should be 
totally rewritten and restructured as a new project has been 
reached, and any further development-like effort on the 
system should constitute the inception of a new project. 

C. RECOMMENDATIONS 

One of the most difficult problems encountered in the 
preparation of this thesis was locating organizations which 
had compiled and/or retained historical data from their 
software development and maintenance efforts. Some of the 
organizations contacted had maintained some form of historical
data, but they had not broken their information down
into a format which could be used to obtain information 
about the separate phases of the software life cycle. 
Therefore, if any meaningful research is to be conducted in 

the future in this area, organizations which are responsible 

for producing or maintaining software products need to start 
accounting properly for the various costs associated with 
this process. Proper accounting includes, not only tracking 
the number of source lines of code produced for the project, 
but total man-hours expended in each phase, the actual time 
frame for each phase, and the applicable complexity factors. 
The collection of this data, however, must be an ongoing 
process, just as is proper documentation of software, and it 
should become a part of this documentation. By making the 
collection process an ongoing process, the data is always 

current, and less subject to error. For, like any other 
form of documentation, if postponed until the end of the
project, it is subject to a host of errors, omissions, and 
inaccuracies. However, even if the collection process is 
done with total perfection, it means nothing unless the data 
is recorded in such a manner that it can be retrieved and 
understood easily. It is therefore recommended that this 
data be stored in an automated data file so that it can be 
accessed quickly and analyzed with greater ease and effi- 
ciency than with a manual system. With the cost of software 
rising at an ever increasing rate, the benefits of this 
information to the organization seem obvious. Not only
should it be better able to predict future software manning 
requirements, but also, it should be able to identify and 
correct other inefficiencies within the development and 
maintenance processes. 
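
One possible shape for such an automated data file is sketched below. The record layout, field names, and sample values are all hypothetical and are offered only to show how the items recommended above (phase, time frame, man-hours, source lines, complexity) might be captured as they occur and summarized later.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields

@dataclass
class PhaseRecord:
    """One reporting period of effort on one project phase (hypothetical layout)."""
    project: str
    phase: str              # for example "development" or "maintenance"
    period_start: str       # ISO date of the first day of the reporting period
    period_end: str         # ISO date of the last day of the reporting period
    man_hours: float
    source_lines: int       # source lines of code produced in the period
    complexity_note: str    # free-text complexity factor, if known

def write_records(records, stream):
    """Write records as CSV so they can be retrieved and analyzed later."""
    writer = csv.DictWriter(stream, fieldnames=[f.name for f in fields(PhaseRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))

def man_hours_by_phase(records):
    """Total man-hours expended in each phase."""
    totals = {}
    for record in records:
        totals[record.phase] = totals.get(record.phase, 0.0) + record.man_hours
    return totals

# Invented sample entries for a hypothetical project
example_records = [
    PhaseRecord("Payroll", "development", "1982-01-01", "1982-01-31", 2400.0, 3100, "batch"),
    PhaseRecord("Payroll", "development", "1982-02-01", "1982-02-28", 2800.0, 4200, "batch"),
    PhaseRecord("Payroll", "maintenance", "1982-09-01", "1982-09-30", 900.0, 350, "batch"),
]

buffer = io.StringIO()
write_records(example_records, buffer)
print(buffer.getvalue())
print(man_hours_by_phase(example_records))
```

Recording one row per period, rather than reconstructing totals at project end, keeps the data current in the sense recommended above.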

As noted by GAO, and as indicated by the NASA data, a
generally accepted and uniform definition of software
maintenance is not now in existence in the majority of
organizations. In addition, management is not presently
requiring that software maintenance be managed as a discrete
function.
This leads to many problems for management at various levels 
of the organization. As such, it is recommended that the 
definition proposed by GAO be adopted as the uniform defini- 
tion of software maintenance. It also is recommended that 
software maintenance be accomplished as a discrete function 
within the organization. The adoption of the GAO definition 
will leave a grey area where enhancements to the old project 
stop and a new project begins. However, if management 
formulates a project maintenance strategy which includes the 
development of a maintenance support level, whether it is 
based on a percentage of the magnitude at the inflection 
point on the life cycle curve, or on some other management- 
defined function, a point will exist above which management 
should decide to terminate enhancements to that project and
start a new project. This project would be developed as a 
follow-on to the old system. The old project should be 
terminated or continued with a minimum maintenance support 
level to effect necessary repairs until the new system comes 
online. 

Although there appears to be a strong correlation 

between peak maintenance manloading and a fixed percentage 

of the manloading at the inflection point of the total life 

cycle Rayleigh curve, further work needs to be done to

determine if this relationship holds true throughout the 

software industry. This work should include comparisons 

across all types of software and comparisons within each 

class to determine if there is a value that management could 

use as a planning tool for the type of software they are 

producing. Follow-on research to this thesis would be most 

beneficial if completed in the following manner. A larger 

base of life cycle/maintenance data must be collected to 

provide a better picture of the relationships concerned and 

to obtain a higher percentage of validity in the findings. 

Projects need to be analyzed individually, grouped by 

project size, grouped by type of system involved, grouped by 

complexity factors (if known), and grouped within specific 

organizations as well as a total combination of the 

collected population. Research should be done to examine
potential relationships of K's, $t_d$'s, and $Y'_{t_{ip}}$ versus $Y'_{t_m}$
for the corresponding life cycle and maintenance curves. A
particularly important area of research will be the effect
of new software development techniques on the software
equation. Any data collected on projects which were developed
in this manner should be segregated and analyzed separately.
The potential for research in this area is unlimited in 
scope and in promise. 
APPENDIX A



ANALYSIS OF SOFTWARE MODELS BY THIBODEAU

A. INTRODUCTION 

Robert Thibodeau, while working for General Research
Corporation, was contracted by the Air Force to conduct a
study of the various models currently available for software
cost estimation. This appendix consists of excerpts
from his review.

B. AEROSPACE MODEL

Description of the Model

The model was developed using regression techniques 
applied to data from software development projects characterized
by one-of-a-kind computers, limited support software,
special languages and severe memory size and
speed requirements. The data were stratified into two 
groups. One group contained 13 projects for the development 
of real time software identified as primarily large-scale 
airborne and space applications. The second group consisted 
of 7 operational support programs presumably without the 
size and speed requirements of the first group. 

The model description is not clear concerning the exact
composition of the estimate of effort required to develop
the software. Only the total effort is estimated. The
estimate is made using a relationship of the form:

     MM = a (Instruction)^b

where the constants, a and b, are determined by regression
analysis.

The estimating relationships are:

Real Time Software

     MM = 0.057 (I)^0.94

Support Software

     MM = 2.012 (I)^0.404

where:

     MM = total development effort, manmonths
     I  = number of instructions (independent of
          language) ....
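
The two relationships above can be evaluated directly. The following is a minimal Python sketch, not part of Thibodeau's review: the function name `aerospace_effort` and the labels `"real_time"` and `"support"` are assumptions made for illustration, and the coefficients are simply those printed above.

```python
# Coefficients (a, b) of MM = a * I**b, taken from the relationships above.
AEROSPACE_COEFFICIENTS = {
    "real_time": (0.057, 0.94),
    "support": (2.012, 0.404),
}


def aerospace_effort(instructions, kind):
    """Return the estimated total development effort in man-months.

    `kind` is "real_time" or "support" (labels assumed for this sketch).
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    a, b = AEROSPACE_COEFFICIENTS[kind]
    return a * instructions ** b


if __name__ == "__main__":
    # Compare the two software classes at the same (hypothetical) size.
    for kind in AEROSPACE_COEFFICIENTS:
        print(kind, round(aerospace_effort(10_000, kind), 1))
```

Because the exponent for support software is much smaller than the one for real time software, the two estimates diverge quickly as the instruction count grows.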

C. DOD MICRO ESTIMATING PROCEDURE

Description of the Model

The primary estimating relationship comprising the DoD 
Micro Procedure can be described as the ratio of a factor 
representing the software to be developed or changed and a 
productivity measure. 

The model form suggests that effort increases directly 
with the number of input and output configurations operating 
on the system being built. Effort also increases with the 
number of routines being created or modified weighted by 
their difficulty. The total effort is scaled according to 
the amount of work that must be done in entirety as opposed 
to modification of an existing system. 

The number of days needed to deliver the product (effec- 
tively the days of effort per unit of product) depends on 
the general experience and accomplishment of the development 
group (measured by their job classifications) weighted by 
their knowledge of the problem to be solved relative to the 
knowledge required. One other factor that directly affects
the productivity is the ease of access to the computer
(measured by turnaround time).

The basic form of the estimating relation for software
development time is:

     Net Development Time = (Product) / (Productivity)

where:

     Product is a measure describing the effort to be performed.

     Productivity is the rate of creating the product from
     the application of personnel time.

     Product = (Number of Formats + Weighted Number of Functions)
               x (Effort Relative to a New Development)

The terms in parentheses along with the following terms
are defined in the discussion of model inputs below:

     (Productivity)^-1 = (Work Days per Unit of Product for a
                          Staff with Average Experience)
                         x (Job Knowledge Required)
                         x (Job Knowledge Available)
                         x (Access)

The result is the total hours required for code development.
Presumably this means detailed design, coding, and
unit testing.

     Gross Development Time = (Net Development Time)
                              x (Other System Factor)
                              x (Non-Project Factor + Lost Time Factor)

A value of 1.8 is recommended for the other system
factor. This factor represents the effort needed to convert
the code development time to total development time. This
value is representative of an observed range from 1.2 to
2.1. Total development includes analysis, design, coding,
testing and documentation. It is the sum of the project
direct charges. Whether this includes support hours for
clerical and other functions is not clear, but any given
organization could include these by modifying the 1.8
factor.

The net development time accounts for the time lost from 
normal scheduled working hours for leave, sickness, holi- 
days, and non-project assignments. These add 25 percent to 
the total development time. There is also a 10 percent 
efficiency factor (coffee breaks, time cards, code rework, 
etc.). The code rework should probably be handled elsewhere.
It is probably included where it is to make the 10
percent palatable. It should be included in the gross size
adjustment and the 1.8 factor.

The effect of these adjustments is to estimate the 
number of personnel who must be assigned to the project to 
ensure delivery of the total development hours. These 
factors are organization specific.
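
The gross adjustment can be sketched in a few lines of Python. This is only an illustration of the Gross Development Time relation given earlier: the function name is hypothetical, and the text does not state how the 25 percent and 10 percent allowances map onto the Non-Project and Lost Time factors, so the values used in the example (1.25 and 0.10) are one possible reading and should be replaced by organization-specific figures.

```python
def gross_development_time(net_time, non_project_factor, lost_time_factor,
                           other_system_factor=1.8):
    """Gross = Net x Other System Factor x (Non-Project + Lost Time).

    The default of 1.8 is the value recommended in the text for the
    other system factor; the remaining factors must be supplied.
    """
    return net_time * other_system_factor * (non_project_factor + lost_time_factor)


if __name__ == "__main__":
    # Assumed reading (not given in the source): 25 percent non-project
    # time expressed as 1.25 and the 10 percent efficiency loss as 0.10.
    print(gross_development_time(180.0, 1.25, 0.10))
```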

Although the resource estimating procedure includes 
weighting factors for the input and output formats by type 
of device (see subsequent discussion), the factors have a
value of one in each case. Therefore, the model describes a
linear relationship between the total number of file formats
and the effort required to implement them. It may be that
future versions of the model will weight the types of file
device differently. Then the effort required to implement a
report format may be different from the effort required for
a card format.

Program complexity, which is the second term in the 
product measure, is the weighted sum of the functions to be 
implemented. The weights depend on the function and its 
assumed level of complexity. The weights range from 1 for a 
simple operating system control language change to 12 for a 
very complex edit-validation function. 
The value 3 is the most common among the 24 possible
function-complexity assignments. If the function types are 
equally represented in programs, the average value is 4. 

The programmer/analyst experience factor is an indication
of the effect of experience on productivity. Values
range from .75 for a lead analyst or programmer to 2.75
for interns. Since experience is
not evenly distributed over a group of programmers and
analysts, the following group was hypothesized in order to
obtain an average or representative value for the experience
factor.



Experience     Number in Group     Factor     Weighted Sum

Lead                  1              .75           .75
Senior                2             1.25          2.50
Journeyman            4             1.75          7.00
Nominal               8             2.25         18.00
Intern                5             2.75         13.75

Total                20                          42.00

Average Value = 42 / 20 = 2.1
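
The representative experience factor is a staff-weighted average of the per-level factors. A short Python sketch of that computation follows; the names `GROUP` and `average_experience_factor` are illustrative only, and the data are the hypothesized group from the table.

```python
# (experience level, number in group, experience factor) from the table.
GROUP = [
    ("Lead", 1, 0.75),
    ("Senior", 2, 1.25),
    ("Journeyman", 4, 1.75),
    ("Nominal", 8, 2.25),
    ("Intern", 5, 2.75),
]


def average_experience_factor(group):
    """Return the weighted sum of factors divided by the head count."""
    total_people = sum(count for _, count, _ in group)
    weighted_sum = sum(count * factor for _, count, factor in group)
    return weighted_sum / total_people


if __name__ == "__main__":
    print(average_experience_factor(GROUP))
```

A different staffing mix changes the average, so an organization would substitute its own head counts before using the factor.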



No definitions are provided for the 10 job classifica- 
tions. The job knowledge and turn-around time factors are 
self-explanatory. 

The System Factor adjusts the product development effort 
to account for work already done. The product measure
resulting from the format count and the program complexity 
value is the same whether the system is being developed in 
its entirety or it is a modification to an existing system. 
The system factor has the effect of modifying the product 
value to account for less than total development. 

Seven levels of change are described by the System
Factor. The values range from 2 for a new development to 8
for an operating systems control language change.

For a new system development the 2 in the primary estimating
equation is divided by a System Factor value of 2 and
the product measure is unchanged. Consequently, the System
Factor values describing lesser amounts of new development
have larger values and are portions of 2. The effect of the
System Factor on the product measure is summarized as
follows:



                                                 Effort Relative to
Type of Effort               System Factor       a New Development

New Development                    2                   1.00
Major Change                       3                    .67
Major Modification                 4                    .50
Minor Modification                 5                    .40
Maintenance                        6                    .33
Minor Technical Change             7                    .29
Operating Systems
  Control Language Change          8                    .25


In order to get a feel for the relative magnitudes of
the components of the Micro Estimating Procedure, consider
the following example.

     Number of I/O formats = 10
     Number of functions = 20
     Average complexity factor = 4
     New Development

     Product = (Number of Formats + Weighted Number of Functions)
               x (Effort Relative to a New Development)

     Product = (10 + 4 x 20) x 2 / 2 = 90

     Experience = 2.0 (see above for computation)
     Job knowledge required = 1.0
     Job knowledge available = 1.0
     Access = 1.0

     (Productivity)^-1 = (Work Days per Unit of Product for a
                          Staff with Average Experience)
                         x (Job Knowledge Required)
                         x (Job Knowledge Available)
                         x (Access)

                       = 2.0 x 1.0 x 1.0 x 1.0 = 2.0

     Net Development Time = (Product) x (Productivity)^-1

                          = 90 x 2.0 = 180 Man-Days

If the effort was a major modification (System Factor =
4), the Product value becomes:

     Product = (10 + 4 x 20) x 2 / 4 = 45

and

     Net Development Time = 45 x 2.0 = 90 Man-Days

If the Job Knowledge Required is "Detailed" (Factor = 1.5)
and the Job Knowledge Available is "Limited" (Factor = 1.5),
the productivity becomes:

     (Productivity)^-1 = 2.0 x 1.5 x 1.5 x 1.0 = 4.5

then for the major modification:

     Net Development Time = 45 x 4.5 = 202.5 Man-Days
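
The worked example can be reproduced with a small Python sketch. The function names are hypothetical and chosen for this illustration; the arithmetic follows the Product and (Productivity)^-1 relations exactly as used in the example, with the relative effort taken as 2 divided by the System Factor.

```python
def product(formats, functions, avg_complexity, system_factor):
    """Product = (formats + weighted functions) x (2 / System Factor)."""
    return (formats + avg_complexity * functions) * 2 / system_factor


def inverse_productivity(experience, knowledge_required,
                         knowledge_available, access):
    """Work days per unit of product, adjusted by the three factors."""
    return experience * knowledge_required * knowledge_available * access


def net_development_time(prod, inv_productivity):
    """Net development time in man-days."""
    return prod * inv_productivity


if __name__ == "__main__":
    new_dev = product(10, 20, 4, system_factor=2)
    major_mod = product(10, 20, 4, system_factor=4)
    baseline = inverse_productivity(2.0, 1.0, 1.0, 1.0)
    mismatch = inverse_productivity(2.0, 1.5, 1.5, 1.0)
    print(net_development_time(new_dev, baseline))    # new development
    print(net_development_time(major_mod, baseline))  # major modification
    print(net_development_time(major_mod, mismatch))  # knowledge mismatch
```

Keeping the product and productivity terms in separate functions mirrors the structure of the procedure, where the product describes the software and the productivity describes the staff and their access to the computer.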

Outputs

The primary output (i.e., the output that is sensitive to
or controlled by project variables as opposed to the subsequent
step which is a fixed allocation) is Gross
Development Time (man-days). Gross Development Time
includes:

• Nonproject time (individual assigned to project but busy
with nonproject tasks, e.g., training, nonproduct administrative
duties, etc., and vacation and holidays)

• Wasted or lost time

Therefore, Gross Development Time describes the staffing
level that will result in a needed amount of development
time. The latter is predicted by program and project
characteristics.

The secondary outputs (i.e., those derived by applying
fixed values to the primary output) are:

• Effort by project phase

• Total development cost

The project phases are:

• Review and analysis

• Design

• Programming

• Testing

• Documentation

Gross Development Time includes:

Analysis of present methods 
Design of the new/changed system 
Develop the system's support 
Program design 
Program development 
Program testing 
System testing 
Installation and conversion 
Staff training 
Project officer 
System manager 
Technical managers 
Support personnel 
Documentation 

Inputs

Product Related Inputs. The software is described by
the numbers of types of items it processes and the numbers
of functions it includes. The functions are described
according to type and complexity. The result is two product
descriptors: one measures the size of the input/output
processing to be executed by the system; the other is a
measure of the number and difficulty of the functions to be 
performed. 

Input File Formats. The number of different formats to
be read by the system are counted and added together. The
model asks for numbers of card, tape, disk, and screen
formats separately, but since the weighting factor is always
one, there is no distinction made among them regarding the
effort involved to implement them.

Output File Formats. The formats output by the system
are totaled. The same entries as for the inputs are 
requested plus the number of report formats. As in the case 
of the inputs, the weighting factor for the different types 
of output is always one, so there is no reason to 
differentiate. 

Program Complexity. The total program complexity
measure is computed by a weighted sum of the number of
processing functions of given types. Each function is characterized
as simple, complex, or very complex. The
processing functions are:

• Edit Validation 

• Table Look-Up (Internal or External) 

• Calculations 

• Sort/Merge Process 

• Internal Data Manipulation 

• File Search 

• Utilities or Subroutines 

• Operating Systems Control Language 

Job Knowledge Required. The amount of knowledge
required to implement or change a system has a direct effect
on the number of hours required to accomplish the project.
A system that requires very detailed knowledge will require
more effort than one that can be accomplished with limited
knowledge. This parameter is paired with the job knowledge
available factor described below to describe the relative
influence on productivity. Three job knowledge levels are
used: Limited, General, and Detailed.

System Factor. The effort required to complete a system
development or change project of given complexity depends on 
the state of the system. That is, the work required to 
develop a system with three file formats, all other factors 
being equal. The System Factor describes the level of 
effort being undertaken. Seven levels are described: 

• System development 

• Major changes 

• Major modification 

• Minor modification 

• Maintenance 

• Minor technical change 

• Operating systems control language 

Resource Related Inputs

Programmer/Analyst Experience Available. The available
experience measure is an effective productivity indicator.
It quantifies the rate at which the product can be produced
in terms of the job classification of the staff available
for assignment to the system development. Two data
processing personnel classifications, Analyst and
Programmer, are tabulated according to five levels of experience:
Lead, Senior, Journeyman, Nominal, and Intern.
Weights are associated with the different experience
levels. The result is a weighted average productivity
factor.

Job Knowledge Available. This factor has the effect of
describing the change in productivity associated with the
level of knowledge about the work to be performed that
exists among the persons available for assignment. It works
together with the Job Knowledge Required factor described
above to quantify the effect of the knowledge of the system
required compared to that available on the time required to
complete the work. In general, the effect of the combined
factors is to increase the development manhours if the need
exceeds the available and decrease the hours if the available
exceeds the need. Three levels of job knowledge availability
are specified: Limited, General, and Detailed.

Program Turn-Around Time. The effect of computer access
on productivity is described by four levels of average
turn-around time:

• Interactive terminal

• More than one run per day

• One run per day

• Less than one run per day.



D. DOTY ASSOCIATES, INC. 



Description of the Model 

The model is actually a set of 15 estimating relationships,
each one to be used for a given type of software and
software life cycle phase. Equations have been derived
empirically using regression analysis for the following
types of software:

• Command and Control 

• Scientific 

• Business

• Utility

The development effort for software representing each of 
the application types may be estimated using one of three 
different relationships. An additional three are given that 
are applicable to all types of software. These equations 
are to be used "when the application cannot be categorized 
or is different than the categories noted". The procedure 
specifies that when a software system is made up of subsys- 
tems that are different types, the total size should be 
divided into the four categories and the appropriate esti- 
mating equation used for each one. Then the individual 
manmonths are summed to give a total system development
effort. The three equations are divided by size measure
(lines of source code or words of object instructions) and
by the life cycle phase in which the estimate is made (Concept
Formulation and all others). If the estimate is to be made
using the words of object instructions, the same equation is
used in all life cycle phases. Similarly, estimating
large systems (more than 10,000 lines) using lines of source
code uses the same equation in all phases, whereas estimating
small systems requires the use of a different equation in the
Concept Formulation Phase than in the other life cycle phases.

The use of the different equations can be described as
follows (A, B, and C refer to the three different
relationships).



                                  LIFE CYCLE PHASE
SOFTWARE DESCRIPTION              CONCEPT     OTHERS

WORDS OF OBJECT CODE                 A           A

LINES OF SOURCE CODE
  LARGE SYSTEM > 10K LINES           B           B
  SMALL SYSTEM < 10K LINES           B           C
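
The choice among relationships A, B, and C can be expressed as a small lookup. The Python sketch below follows the table as read here (A for object words in every phase, B for large source-line systems in every phase, and B then C for small source-line systems); the function name, argument labels, and the treatment of a system of exactly 10,000 lines as small are assumptions, since the source does not address that boundary.

```python
def select_equation(size_measure, phase, source_lines=None):
    """Return "A", "B", or "C" for the given sizing basis and phase.

    size_measure: "object_words" or "source_lines" (labels assumed here)
    phase: "concept" for Concept Formulation, "other" for later phases
    source_lines: estimated lines of source code, needed for "source_lines"
    """
    if phase not in ("concept", "other"):
        raise ValueError("phase must be 'concept' or 'other'")
    if size_measure == "object_words":
        return "A"
    if size_measure != "source_lines":
        raise ValueError("unknown size measure")
    if source_lines is None:
        raise ValueError("source_lines is required for this size measure")
    if source_lines > 10_000:
        return "B"
    # Assumption: exactly 10,000 lines is treated as a small system.
    return "B" if phase == "concept" else "C"


if __name__ == "__main__":
    print(select_equation("source_lines", "other", source_lines=6_000))
```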



The forms of the estimating relationships are similar.
Equations A and B are of the form:

     MM = a I^b

where

     MM  = manmonths of development effort.
     I   = either words of object code (A) or lines of
           executable source code (B).
     a,b = constants obtained empirically.

Equation C has the form:

     MM = c I^d \prod_{j=1}^{14} f_j

where

     f_j = a set of parameters describing the development
           environment.
     c,d = constants obtained empirically....
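
The two functional forms can be sketched in Python as follows. The excerpt does not give the empirical constants, so they are left as arguments and the numbers in the example are hypothetical. The sketch also assumes that the environment factors in equation C are multiplied together, since the operator is not fully legible in the source.

```python
import math


def effort_power_law(size, a, b):
    """Equations A and B: MM = a * I**b."""
    return a * size ** b


def effort_with_environment(size, c, d, factors):
    """Equation C: MM = c * I**d times the product of the factors f_j.

    Assumption: the f_j combine multiplicatively.
    """
    return c * size ** d * math.prod(factors)


if __name__ == "__main__":
    # Hypothetical constants and factor values, for illustration only.
    print(effort_power_law(8_000, a=0.01, b=1.0))
    print(effort_with_environment(8_000, c=0.01, d=1.0, factors=[1.1, 1.2]))
```

With this form, a factor of 1.0 leaves the estimate unchanged, so only the environment items that apply to a project alter the result.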



The following guidelines are presented for selecting the 
proper estimating relationship. 

• In Concept Formulation, if the size of the program in 
object code is known, use the object code estimators. 
They will give more accurate estimates of manpower 
requirements. 

• If accurate estimates of manpower requirements are required
in the Analysis and Design and subsequent phases
of development, use equation B, in source code, for
programs of I > 10,000 and equation C, in source code,
for programs with I < 10,000.

• For budgetary purposes, use the equation that gives the 
higher estimate. 

Development time is estimated using the equation

     D = 1000 I / (92.25 + 233 I^.667)

where

     D = reasonable development time in months
     I = number of delivered object instructions.

This relationship was obtained using regression on data
describing 74 development projects. The time estimate
should describe a "customary" distribution of effort over time;
that is, it should avoid extremes of project time compression
or expansion.

It should be noted that a large portion of the documen- 
tation accompanying the description of the DAI estimating 
procedures is devoted to discussions of factors that are 
believed to influence the cost of software development. 
These factors are classified according to aspects of software
and its development environment. The factors are
grouped according to the following "domains": 

• Requirements 

• System Architecture/Engineering 

• Management

Outputs

Cost of Software Development

The estimate of total development cost is based on
several relationships that apportion the cost into components
that can be estimated by applying available ratios to other
costs and factors such as overhead and administrative costs.
By the proper use of relevant values for these factors the
relationships can represent either government in-house costs
or contractor development costs. A method is described for
time phasing the expenditure that is said to satisfy the
requirements of DoD Directive 5000.1.

The procedure identifies costs that are incurred by the
government during all phases of the software life cycle
except Operation and Support. The total development cost
includes:

     C = C_CF + C_VAL + C_FSD

where

     C     = Development Cost
     C_CF  = Conceptual Phase Cost
     C_VAL = Validation Phase Cost
     C_FSD = Full Scale Development Cost.

Information is included that relates the government cost 
to the contractor's full scale development cost. This cost 
is the one developed by the formal software cost estimating 
procedure.

The cost of development is divided into primary and
secondary costs, thus:

     C_D = C_P + C_S

where

     C_D = Cost of Development
     C_P = Primary Cost (Manpower)
     C_S = Secondary Cost (Computer, Documentation, etc.)

then,

     C_P = MM (C_e)

where

     MM  = Total Development Man-Months
     C_e = Average Labor Cost

and

     C_S = \sum_{i=1}^{n} C_i = k C_P

Therefore:

     C_D = (MM) C_e (1 + k)

where

     k = Ratio of Secondary to Primary Costs (= .075)
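
The cost relationship reduces to a single multiplication once the man-month estimate is known. A minimal Python sketch follows; the function name and the labor rate in the example are hypothetical, while the default ratio k = .075 is the value quoted above.

```python
SECONDARY_TO_PRIMARY_RATIO = 0.075  # k, as quoted in the text


def development_cost(man_months, average_labor_cost,
                     k=SECONDARY_TO_PRIMARY_RATIO):
    """Return C_D = MM * C_e * (1 + k)."""
    primary = man_months * average_labor_cost  # C_P, manpower cost
    secondary = k * primary                    # C_S, computer, documentation
    return primary + secondary


if __name__ == "__main__":
    # Hypothetical inputs: 100 man-months at 5,000 dollars per man-month.
    print(development_cost(100, 5_000))
```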

The total software development cost (which does not include
government Conceptual and Validation Phase costs) includes
the costs of:

• Analysis

• Design

• Code

• Debug

• Test and Checkout

and is proportional to the total man-months of development
effort.






Total Development Man-Months

This is the primary output variable. It is the basis 
for the total development cost estimate and it is the value 
from which the distribution of effort by life cycle phase is 
derived. The hours include those directly related to the 
development of the software system. They include the direct 
hours needed for: 

Analysis - interpreting the system requirements and 
producing viable alternative system concepts 
Design - preparing detailed designs of the data processing 
system and the individual programs 

Coding and Debugging - writing individual modules and 
programs and performing individual tests 

Testing and Checkout - integrating the individual subsys- 
tems into a complete system and conducting prescribed 
tests on the entire system. 

The discussion of the model does not indicate the extent 
that support and management hours are included in the total. 
Also, there may be some question about the activities asso- 
ciated with concept development (e.g., is the test plan 
furnished by the government following the validation phase 
or is it developed as part of the project) . As in many cost 
estimating situations, the line between concept analysis and 
the evaluation of solutions to selected concepts is hazy. 

Although the DAI documentation and discussions with the
authors indicate that the model includes integrated system
testing, it appears that this effort is not included in the
original SDC data which was the basis for the curve fits
(76% of the SDC data points describe programs that do not
interface with any other programs).

Software Development Time

A nominal development time is presented that implies
"customary manloading". That is, the schedule neither
reflects crash projects nor allows for unnecessary delays.

Distribution of Development Effort

The expenditure of time and effort associated with major
project milestones is given for small projects (one level of
supervision) and large projects (more than one level of
supervision). The distributions are for nominal projects
and do not allow for any possible acceleration or delay of
the completion of the project....

Inputs

Program Size

DAI has been very careful to describe the size variables
which are the primary inputs to the estimates using
the relationships. However, we should point out that the
respondents to the original SDC questionnaire were not so
well directed, and it may be necessary, when analyzing the
structure of the model as it relates to prediction accuracy,
to recognize that significant errors may have been introduced
by this failure to be specific. The DAI model may not overcome what
are inherent limitations in the data.

The DAI procedure calls for several estimates in support
of the DSARC process. It recognizes that the best estimates
of program size are obtained later in the development cycle.
It suggests, then, that the interpretation of the program
size changes during the life cycle and that associated with
the change are increases in estimating accuracy. The report
describes how the knowledge of the size estimator changes
during the life cycle and how this affects the estimating
precision. The precision associated with the different size
measures during the system development life cycle is as
follows.

Code that is developed as part of the project but is not
delivered to the customer is a source of variation in the
estimate of the system size and must be considered.
However, no guidance is provided for making any adjustment
other than citing that the SDC data showed delivered code to
average 77 percent of the developed code with a standard
error of 30 percent.






SOFTWARE ESTIMATE        WHEN                  SIZING BASIS           % ERROR

1. INITIAL PROGRAM       CONCEPTUAL PHASE      TOTAL OBJECT CODE      UP TO 200%*
   BUDGETARY ESTIMATE

2. INDEPENDENT PROGRAM   VALIDATION PRIOR      TOTAL OBJECT MINUS     UP TO 100%
   VALIDATION COST       TO RFP RELEASE        DATA AREAS
   ESTIMATE

3. INDEPENDENT FSD       COMPLETION OF         TOTAL OBJECT MINUS     UP TO 15%
   COST ESTIMATE         SYSTEM SPEC           DATA AREAS WITH
                         THROUGH PDR           ADJUSTMENTS FOR

4. UPDATE OF FSD         PDR THROUGH           TOTAL SOURCE CODE      UP TO 50%
   COST ESTIMATE         REMAINDER OF                                 IMPROVING
                         DEVELOPMENT                                  TO ZERO AT
                                                                      COMPLETION

*THE ACTUAL MAY BE 200 PERCENT OF THE ESTIMATED OR THE ESTIMATED MAY BE 200
PERCENT OF THE ACTUAL.



Allowance must also be made for support software development,
especially when working with new hardware.

Total Object Words

During the Conceptual Phase when very little is known
about the system to be developed, the initial estimate is
made using the analyst's judgement (usually by analogy with
previously developed systems, but other methods are
possible) of the number of object words occupied by "every
program needed to run and maintain the system in the field".
This measure is obtainable from listings of computer system 
routines that build executable programs from the output of 
the compiler. Taking values from systems similar to the one 
being planned can provide a basis for estimating the value. 
Care should be taken, however, when program overlays are 
involved. Also, extensive use of standard library routines 
can greatly increase the words of object program size and 
not be representative of a comparable increase in develop- 
ment effort. 






Total Object Words Minus Data Areas

The memory space occupied by an executable program is 
composed of locations containing instructions and locations 
reserved for the data upon which the program will operate. 
Sometimes the data storage areas are significantly larger 
than the area occupied by the actual instructions. DAI 
suggests that the effort required to develop the programs is 
more closely related to the size of the instruction space 
than to the size of the combined data and instruction 
storage. However, as in the case of the total object words, 
there is no evidence of this distinction being made in the 
original derivation of the estimating procedures. Also, 
there is no guidance provided on how to apply the additional 
information when preparing cost estimates. Some computer 
system executive processing routines provide this informa- 
tion. However, many don’t and, therefore, it would be very 
difficult to obtain comparable historical information to 
guide new estimates. 

New Object Words Minus Data Areas

Only the writing of new code contributes to the software
development effort (if code written to modify existing
modules is counted as new code). To account for the work
done to adapt existing code to a new system, which includes
analyzing the code and deciding how to modify it, any
existing module that will result in less than 50 percent
utilization of existing code is considered to be entirely
new.

New Source Lines

Counts of new source lines written (whether in a higher
order or machine oriented language) can be obtained from
compiler listings, measuring card decks or text editors. It
is one of the easiest measures of size to obtain. As in the
previous case, modules containing less than 50 percent
reused code are considered to be new.
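
The 50 percent rule can be stated as a short function. The Python sketch below is one reading of the rule and is not taken from the DAI documentation: a module with less than 50 percent reused code counts in full, and otherwise only the part that is not reused is counted, on the stated premise that only new code contributes to effort. The function name and arguments are hypothetical.

```python
def counted_new_size(module_size, reused_size):
    """Return the portion of a module counted as new code.

    module_size and reused_size are in the same unit (source lines or
    object words). A module with less than 50 percent reused code is
    treated as entirely new; otherwise only the non-reused part counts
    (an assumption of this sketch).
    """
    if module_size <= 0 or not 0 <= reused_size <= module_size:
        raise ValueError("sizes must satisfy 0 <= reused_size <= module_size")
    reuse_fraction = reused_size / module_size
    if reuse_fraction < 0.5:
        return module_size
    return module_size - reused_size


if __name__ == "__main__":
    print(counted_new_size(1_000, 400))  # mostly new: counted in full
    print(counted_new_size(1_000, 600))  # mostly reused: only new part
```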

Development Environment

For estimates made using lines of source code where the
size is less than 10,000 lines, the estimating relationship
includes a number of factors describing the development
environment. These are included in the estimate when the
indicated item is to be part of the development process....

f1   Special Display
f2   Detailed Definition of Operational Requirements
f3   Change to Operational Requirements
f4   Real Time Operation
f5   CPU Memory Constraint
f6   CPU Time Constraint
f7   First SW Developed on CPU
f8   Concurrent Developed on CPU
f9   Time Share Versus Batch Processing in Development
f10  Developer Using Computer at Another Target Computer
f11  Development at Operational Site
f12  Development Computer Different from Target Computer
f13  Development at More than One Site
f14  Programmer Access to Computer

After analyzing the method used by DAI to obtain their
estimating relationships and after comparing their definitions
of input and output variables with the original
sources of data, it is clear that there are discrepancies
between the way the data are being applied and what they
originally represented. DAI does not explicitly justify
their approach but their presentation of the estimating
procedure does give consideration to errors arising from
differing definitions of the variables.

DAI seems to be saying that consistent use of the esti- 
mating procedures regardless of how they were obtained will 
produce results with at least a predictable error. That is, 
knowing the range of error that can occur because of 
differences in definitions and ability to predict the input
variables will, when applied to the given estimating rela- 
tionships, produce estimates with precision that is in 
accordance with previous experience. DAI further substanti- 
ates the approach of throwing all the error into the ability 
to define the input by presenting standard error values for 
the size variables at different times in the life cycle. 

E. FARR AND ZAGORSKI MODEL

Description of the Model

System Development Corporation completed several
projects for the Air Force, Electronic Systems Division, in
which they attempted to develop methods for predicting the
cost of software development. The Farr and Zagorski model
represents an intermediate stage in the program.

Using historical data from internal projects and from
other organizations, the SDC team systematically tested over
100 variables to learn if they were satisfactory predictors
of program design, coding and debugging effort.

Farr and Zagorski published three equations which were
determined to be the best predictors tested up to that time.

Definitions of Inputs

X1  = number of instructions in original estimate (in
      thousands)
X2  = subjective rating of information system complexity
      (scale 1-5)
X3  = number of document types delivered to customer
X4  = number of document types for internal use
X5  = number of computer words needed to store program
      data (log10)
X6  = number of instructions in delivered program (in
      thousands)
X7  = number of man-miles for travel (in thousands)
X8  = system programmer experience (average of total years
      of experience with the computer, language, and
      application)
X9  = number of display consoles
X10 = percent of instructions new to this program (not
      re-used from previous versions)
X11 = number of instructions to perform decision functions
      (in thousands)
X12 = number of instructions to perform nondecision
      functions (in thousands)
X13 = programmer experience with this application (average
      number of years).

F. WOLVERTON

Model 

Estimates of routine size are converted to costs using
cost per instruction values that are functions of the
routine type and complexity. The costs are fully burdened
and when summed for all the system routines represent the
total system development cost. Development extends from
analysis and design through operational demonstration. A
matrix of ratios is used to allocate the total cost to 7
phases with each phase divided into up to 25 activities.
This allocation is compared from the standpoints of staff,
schedule, and general credibility.

The model, then, is a combination of formal algorithm 
and judgement. It has been used successfully at TRW. As 
described by Wolverton, it features a data base of histor- 
ical data that provide the necessary cost per instruction 
and allocation values. The procedure is adaptable to any 
new environment by creating a new data set representing 
local definitions of phases and activities and burdened cost 
conventions. In fact, Wolverton cautions that the given 
values of cost per instruction are for illustration and
users should prepare their own values.

TRW has computerized the maintenance of the cost data
base and the allocation process. Given the inputs of size 
and complexity, the system calculates the cost allocations 
and facilitates any subsequent adjustments. Since most 
models are used in a similar manner, even if the procedure 
for using the model does not say so, there should be no 
compromise of the model’s performance if the evaluation is 
based on a single estimate of costs. Other adjustments that 
are necessary to execute the model in different environments 
will be discussed later. 

The estimating procedure begins by identifying all the
routines comprising the system. Each routine's size, category,
and relative degree of difficulty are estimated by
knowledgeable persons.

The categories that have "stood the test of usage" at
TRW are:

• Control routine 

• Input/Output routine 

• Pre or Post Algorithm processor

• Algorithm

• Data Management routine 

• Time-Critical processor 

Relative difficulty is indicated by six levels depending 
on whether a routine is Old or New and then by simply: Easy, 
Medium or Hard. 

....Multiplying the cost per instruction for each
routine by its number of object instructions and summing the
products for all the routines yields the estimated total
development cost.
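
The summation described above can be sketched in a few lines of Python. This is only an illustration: the cost-per-instruction figures are hypothetical placeholders (Wolverton himself cautions that users should prepare their own values), and the function and table names are invented for the example. The category letters and the old/new, easy/medium/hard labels follow the lists given in this section.

```python
# Hypothetical cost per object instruction (dollars), keyed by
# (category, difficulty). These numbers are placeholders, not
# Wolverton's published values.
COST_PER_INSTRUCTION = {
    ("C", "new-easy"): 20.0,
    ("A", "new-hard"): 35.0,
    ("T", "old-medium"): 45.0,
}


def development_cost(routines, table=COST_PER_INSTRUCTION):
    """Sum (cost per instruction x object instructions) over all routines."""
    return sum(table[(category, difficulty)] * instructions
               for category, difficulty, instructions in routines)


# Each routine: (category, difficulty, estimated object instructions).
routines = [
    ("C", "new-easy", 1000),
    ("A", "new-hard", 2000),
    ("T", "old-medium", 500),
]
total = development_cost(routines)
```

A real application would replace the placeholder table with values drawn from the organization's own historical data base.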

The development cost is allocated to the following 7 
phases using proportions for each phase that were obtained 
from the historical data base. 

A. Performance and Design Requirements

B. Implementation Concept and Test Plan

C. Interface and Data Requirements Specification

D. Detailed Design Specification

E. Coding and Auditing

F. System Validation Testing

G. Certification and Acceptance Demonstration

Then, the cost for each phase is divided into up to 25
activities....

A matrix of computer hours by phase and software type is 
used to estimate computer usage costs for development. 

Outputs

Development Cost

The given cost values are in 1972 dollars. The value of
cost results from applying "bid rates" to labor costs which
account for fringe benefits, overhead, administrative
expenses and other indirect costs. Documentation and travel
costs are added to the labor costs. Finally, estimates are
made of the computer costs. The distribution of the costs
by phases and activities was described above.

Development Effort 

Cost is not a suitable basis for evaluating the 
different software estimating models because of differences 
in accounting practices among organizations and because of 
inflation. Therefore, the Wolverton cost values were
converted to manmonths using an average burdened cost per
manmonth of $4600. This value was obtained from the article 
describing the TRW estimating procedure and, therefore, 
should be representative of the cost environment. 
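
As a small worked illustration of this conversion, the sketch below divides a cost figure by the $4600 rate quoted above. The $460,000 input is a hypothetical estimate chosen only to make the arithmetic easy to follow, and the function name is invented for the example.

```python
# Average burdened cost per manmonth quoted in the text (dollars).
BURDENED_COST_PER_MANMONTH = 4600.0


def cost_to_manmonths(cost_dollars, rate=BURDENED_COST_PER_MANMONTH):
    """Convert a fully burdened development cost to manmonths of effort."""
    return cost_dollars / rate


# Hypothetical model output of $460,000.
effort = cost_to_manmonths(460_000.0)
```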

Inputs 

Object Instructions

The model input measure of size is applied to programs
or routines. These are taken to be functionally distinct
elements of a system that would be developed independently
then integrated into the delivered system. It is expected
that these would be independently operable using test
drivers. Such a definition is consistent with industry
usage. The reference document is not specific on this
point. The term "instructions" is taken literally. This
means estimating the number of instructions in the
executable program exclusive of any data areas. The number of
instructions may be estimated by obtaining the words of
memory occupied by the executable code and dividing by the
average words per instruction.

Software Categories

Each routine is characterized according to one of the
following categories:

C. Control Routine. Controls execution flow and is
nontime critical.

I. Input/Output Routine. Transfers data into and out of
the computer.

P. Pre or Post Algorithm Processor. Manipulates data
for subsequent processing or output.

A. Algorithm. Performs logical or mathematical
operations.

D. Data Management Routine. Manages data transfer
within the computer.

T. Time Critical Processor. Highly optimized machine
dependent code.

Degree of Difficulty

Wolverton indicates that any numeric representation of 
complexity may be used. The main purpose is to distribute 
the cost per instruction values over the range of experience 
for a given category of software. He suggests a simple 
designation of old or new, depending on a loose
interpretation of the amount of reusable code, and easy, medium or hard
compared with other programs in the same category.

APPENDIX B

ANALYSIS OF SOFTWARE MODELS BY WOLVERTON

A. INTRODUCTION 

R. W. Wolverton studied several software cost estimating
models while working for TRW in an effort to determine that 
model which would best predict those costs associated with 
software development. This appendix consists of excerpts 
from his review of some of these models. 

B. BOEING COMPUTER SERVICE COST MODEL 
Purpose 

Boeing Computer Services (BCS) designed this analytical 
model to provide an estimate at proposal preparation time of 
the number of manmonths needed to design a computer program. 
BCS developed the model for use as an internal guideline to 
cross-check the traditional bottom-up estimate made by their
proposal manager. The bottom-up estimate, with its WBS, was
tacitly assumed to be more accurate and the model served to 
aid in independently justifying the proposal manager's 
estimate. 

While under contract to RADC, Boeing used their cost 
model to test several hypotheses about the cost benefit 
attributable to modern programming practices (Black, et al. , 
1977; Black, 1978). BCS derived and calibrated their model 
against internal software projects using traditional 
programming practices. This model has received wide-spread 
exposure as part of the DOD's embedded computer resources
DSARC guidebook (DeRoze, 1977).

Input

a. Size of computer software in units of delivered
source statements. The BCS model assumes that a
"statement" is one fully checked, tested, and
documented statement coded in a selected language. The
choice of high-level language can have a significant
effect on the development cost, but ordinarily affects
only portions of the total task.

b. Type of software to be developed. BCS observed some
combination of five generic functions. Each "type"
has its own group productivity rate. The specific
software types and productivity rates are as follows:

   Software Type                      Manmonths per 1000
                                      source statements
   Mathematical Operations                    6
   Report Generation                          8
   Logic Operations                          12
   Signal Processing, Data Reduction         20
   Real-Time, Executive or
   Avionics Interfacing                      40

The decreasing productivity is caused by the
increasing complexity of the type of software being
developed.




c. Tasks to be accomplished in the computer software
development are distributed by the BCS model as
follows:

   Task                          % Total Cost
   Requirements Definition             5
   Design and Specification           25
   Code Preparation                   10
   Code Checkout                      25
   Integration and Test               25
   System Test                        10

The numerical distribution opposite the task does not
consider reuse and sophisticated debug tools. The
distribution is not necessarily a rectilinear function
of time, but is intended to be used as a guideline for
schedule preparation. Documentation is not included
in this estimating procedure and must be estimated by
some other method, not defined in the model itself,
and added to the manpower estimates.

d. Adjustment of the labor estimates is accomplished by
means of table lookup multipliers given in Table VIII.
All terms are assumed by the model developer to be
self-explanatory.

Computational Procedure

Using this model, Program Office personnel would
estimate how much of the total OFP software is most closely
represented by each of the five generic types of software. In
practice, estimating the size and type would be based on
past experience with similar projects that have been
adjusted to the new application. Everything associated with
the manmonth estimate flows from this first step.

Table VIII provides the estimator with phase-sensitive 
multipliers for adjusting the baseline manmonths estimate. 
The user should be alert to stringent sizing or timing limi- 
tations. These effects should be estimated by some other 
procedure (not given) and added to the baseline manmonth 
estimate. 

After individual labor costs have been adjusted by use
of the table, the BCS model sums up the individual estimates
and arrives at the total labor cost for the project.
Computer time is estimated by a rule of thumb that
approximately three hours of stand-alone computer time will be
spent per manmonth.

Output

The fundamental output is the total manmonths estimated 
for the planned software project. In turn, the total 
manmonths are spread over a six stage development cycle from 
requirements definition to system test. 
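
The BCS procedure described above reduces to a table lookup, a sum, and a percentage spread, which the following Python sketch illustrates. Several things here are assumptions: the dictionary keys and function names are invented for the example, the software mix is hypothetical, the real-time rate is read as 40 manmonths per 1000 statements from a partly illegible figure in the source, and the Table VIII adjustment multipliers are omitted because they are not reproduced in this excerpt.

```python
# Manmonths per 1000 delivered source statements, by software type.
# The real-time rate of 40 is an assumed reading of an unclear figure.
RATES = {
    "mathematical": 6,
    "report_generation": 8,
    "logic": 12,
    "signal_processing": 20,
    "real_time": 40,
}

# Percentage of total effort assigned to each task by the BCS model.
TASK_SHARE = {
    "requirements_definition": 5,
    "design_and_specification": 25,
    "code_preparation": 10,
    "code_checkout": 25,
    "integration_and_test": 25,
    "system_test": 10,
}

# Rule of thumb from the text: stand-alone computer hours per manmonth.
COMPUTER_HOURS_PER_MANMONTH = 3


def baseline_manmonths(statements_by_type):
    """Unadjusted effort: rate x (statements / 1000), summed over types."""
    return sum(RATES[kind] * count / 1000
               for kind, count in statements_by_type.items())


def spread_by_task(total_manmonths):
    """Allocate total effort over the six development tasks."""
    return {task: total_manmonths * pct / 100
            for task, pct in TASK_SHARE.items()}


# Hypothetical mix of delivered source statements.
mix = {"mathematical": 5000, "logic": 10000, "real_time": 2000}
total_mm = baseline_manmonths(mix)
by_task = spread_by_task(total_mm)
computer_hours = total_mm * COMPUTER_HOURS_PER_MANMONTH
```

In actual use the baseline would be adjusted with the Table VIII multipliers, and documentation effort would be added separately, before the totals are reported.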

Although acceptable engineering accuracy in estimating 
total manmonths is claimed by the model developers for 
traditional programming practices (c. 1970) , the examples of 

estimating accuracy are not encouraging for modern program- 
ming practices. In other words, the intent of the BCS model 
is to show how much a new project would have cost if done 
the old way. Presumably the lower observed cost is due to 
the new design methodologies. Output results for five 
projects given by BCS are shown in Table IX. A guideline is 
to try this model on some historical data and compare the 
accuracy of predicted versus actual manmonths before 
attempting to use it in practice,... 

TABLE IX

Forecasted versus Actual Costs for the BCS Model

   Project   Forecast           Actual             Forecast/Actual
             Total Manmonths    Total Manmonths    Ratio
   A            419.7               71.0               5.9
   B           2288.5              991.7*              2.3
   C             51.5               43.?               1.2
   D           3298.7              514.8*              6.4
   E              7.9                7.3               1.1

   * Contains some estimate-to-complete data, along with
     actuals

C. IBM WALSTON-FELIX COST MODEL

Purpose

Walston and Felix conducted experiments on 60 completed
software development projects in their search for a method
of estimating programming productivity (Walston-Felix, 1977).
The purpose of this effort was to estimate the rate of
production of lines of code by projects, as influenced by
project conditions and requirements.

Five specific objectives of the Walston-Felix model are:

a. To evaluate improved programming technologies. 

b. To provide support for proposals and contract 
performance. 

c. To gather historical records of the software devel- 
opment work performed. 

d. To provide programming data to management. 

e. To foster a common programming terminology. 

Completed projects in the Walston-Felix data base ranged
in size from 4,000 to 467,000 delivered source lines of code
and in effort from 12 to 11,753 manmonths. Applications
programs included realtime process control; interactive,
report generators; data base control; and message switching
programs. Twenty-eight different high-level languages and
66 different computers are represented in their data base.
This is an outstanding example of a closed-form model
obtained by linear regression analysis of a large and
diverse body of actual software projects. Some further
technical work is required to extend the findings of Walston
and Felix to the specialized needs of avionics software.
The additional work to be done in calibration of the model
will be discussed in...Computational Procedure.

Input 

a. Number of lines of delivered source code. Source
lines are 80-character source records provided as
input to a language processor. Job control languages,
data definitions, link edit language, and comment
lines are included. Reused code is not included.

b. From the raw data provided by the 60 projects, a set
of 68 variables was selected for analysis to find
which ones were significantly related to productivity.
Twenty-nine of the variables showed a significant
correlation with productivity and have been retained
for use in estimating....

c. ....The model user is asked to answer a multiple-
choice question in his response to the statement: User
participation in definition of requirements is: none,
some, much. In the original analysis the mean
productivity was computed for the 60 completed
projects for which no user participation was reported
and found to be 491 DSL/MM. The mean productivity for
all projects that reported some user participation was
267 DSL/MM, and the mean productivity for those
reporting much user participation was 205 DSL/MM. The
absolute value of the change in productivity from no
user participation to much user participation is found
to be 286 DSL/MM.

Computational Procedure

The Walston-Felix cost model can aid Program Office
personnel in estimating five project parameters:
productivity, schedule, cost, quality, and size of the software
product to be delivered. One difficulty is in identifying
and measuring independent variables that can be used to
estimate the desired variables, such as estimating the size
of the software product to be delivered. We take the point
of view that the size of the software product to be
delivered can be independently, albeit with difficulty, estimated
from the internal historical data base associating avionics
function with size (Battelle, 1978) or avionics function
with software requirements (Heninger, et al., 1978).

Productivity is a significant variable in all software 
estimating processes. Programming productivity is defined 
here as the ratio of the delivered source lines of code
(DSL) to the total project effort in manmonths (MM) required
to produce the delivered product. Total manmonths covers
the management, administration, analysis, operational 
support, documentation, design, coding, and testing effort 
expended in the development phase. Analytical results are 
derived at start of work, PDR, midway through software 
development, at acceptance test completion, and every three 
months during the service or maintenance phase. 

The 29 variables...are combined into an index based on
the effect of each variable on productivity from previous
analysis. The productivity index is computed as follows:

   I = \sum_{i=1}^{29} W_i X_i

where

   I      = productivity index for a project

   W_i    = question weight, calculated as
            0.5 \log_{10} (PC)_i

   (PC)_i = productivity change indicated for a given
            question i....

   X_i    = question response (+1, 0, or -1), depending on
            whether the response indicates increased,
            nominal, or decreased productivity.
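
A minimal Python sketch of the index calculation follows. It assumes, as the user-participation example suggests, that the productivity change for a question is expressed in DSL/MM (286 for that question). The other two productivity changes and all three responses are hypothetical, and the function names are invented for the example; a real calculation would use all 29 questions.

```python
import math


def question_weight(productivity_change):
    """Weight for one question: 0.5 * log10 of its productivity change."""
    return 0.5 * math.log10(productivity_change)


def productivity_index(answers):
    """Sum of weight x response over (productivity_change, response) pairs.

    Each response is +1, 0, or -1 for increased, nominal, or decreased
    productivity.
    """
    return sum(question_weight(change) * response
               for change, response in answers)


# 286 DSL/MM is the user-participation change quoted in the text;
# the remaining values are hypothetical.
answers = [(286.0, -1), (100.0, +1), (150.0, 0)]
index = productivity_index(answers)
```

A nominal response contributes nothing to the index, so only questions answered above or below nominal move the estimate.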

....The data set is analyzed by ordinary least squares
and the standard error of estimate, or standard deviation of
residuals, is shown as dashed lines. In the data sample
studied, the productivity index ranged from -4 to +4
(private communication with C. Walston). The Air Force
model user would determine his own productivity index for a
single project by answering the 29 questions...and by
calculating I according to the above formula. He then
multiplies his average productivity for all past avionics
software in his data base by the productivity index for the
acquisition at hand.

If the Program Office has a historical data base of many
projects, the total effort can be determined by a least
squares fit and the regression equation from the Program
Office's own internal data analysis at the point I = 0,
DSL/MM = 274, using the coordinate system.... A
statistical analysis program such as the Statistical Package for
the Social Sciences (a product of SPSS, Inc.) would be
helpful. SPSS will also provide other descriptive
statistics such as the standard error of the linear regression
line....

The statistics...are given by medians and quartiles
because of the variability in the measurement data. Note
that the median productivity (I = 0) is 274 DSL/MM. The 
median for the size of the delivered software product is 
20,000 lines; 50 percent of the projects reported that the 
size of their delivered code ranged from 10,000 to 59,000 
lines. Resources for project development are shown. The 
error detection results are for the distribution of errors 
reported during the development period.... 

The amount of calendar time to allow for the development 
of software is difficult to express from a closed-form 
model. However, the equation for project duration in months 
as a function of total effort in manmonths was found to be:

   M = 2.47 E^{0.35}

where

   M = duration in months, for full-scale development

   E = effort in manmonths, for full-scale development.
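
The duration relation above is easy to evaluate directly. The sketch below does so in Python, writing the effort as E; the 100-manmonth input is hypothetical, and the average-staff helper simply applies the definition given later in this section (total manmonths of effort divided by the duration).

```python
def duration_months(effort_manmonths):
    """Project duration in months from effort: 2.47 * E ** 0.35."""
    return 2.47 * effort_manmonths ** 0.35


def average_staff(effort_manmonths):
    """Average number of people: total effort divided by duration."""
    return effort_manmonths / duration_months(effort_manmonths)


# Hypothetical 100-manmonth development effort.
months = duration_months(100.0)
staff = average_staff(100.0)
```

Because the exponent is well below one, doubling the effort lengthens the schedule by only about 27 percent, which is why larger projects carry larger average staffs.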











From the data collected for service projects, certain 
descriptive statistics were calculated.... The interpreta- 
tion is the same as before: median data and quartile data 

are presented due to the scatter in the raw reports. No 
predictive relationships are given for service projects. 

Documentation, as defined in this model, consists of
program functional specifications and descriptions, users'
guides, test specifications and results, flowcharts, and
program source listings that are delivered as part of the
documentation. To a close approximation, the least squares
equation for the number of pages of delivered documentation
varies directly as the number of lines of source code; that
is

   D = 49 L^{1.01}

where

   D = pages of documentation, including source listings
   L = thousands of source code lines
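
As a brief worked example, the following Python sketch evaluates the documentation relation for a 20,000-line product, the median delivered size quoted earlier in this section. The function name is invented for the illustration.

```python
def documentation_pages(source_lines):
    """Pages of documentation: 49 * L ** 1.01, with L in thousands of lines."""
    thousands = source_lines / 1000.0
    return 49.0 * thousands ** 1.01


# Median delivered size reported for the data base: 20,000 lines.
pages = documentation_pages(20_000)
```

The result is roughly 1,010 pages, or about 50 pages per thousand lines, consistent with the statement that documentation varies almost directly with code size.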

Output

The major outputs available to the model user are as
follows:

a. Total effort in man months required to produce the 
lines of source code. 

b. Duration of project in months. 

c. Use of improved programming technologies expressed as 
a percentage of code developed using each technique. 

d. Estimated productivity of project as influenced by 
project environment and requirements. 

e. Pages of documentation for the intended project, 
including pages of source listings delivered as part 
of the documentation requirements. 

f. The results do not support answers to certain
project attributes implied by the data
coefficients...because of cross-correlation effects (i.e.,
the individual attributes are not statistically
independent). For example:

1. Chief programmer team.

2. Top down development.

3. Structured programming.

4. Design and code inspections.

The contribution of each attribute could not be taken
individually because in the definition of chief
programmer team the other techniques are implied.

g. Other descriptive statistics can be inferred from
study of the report itself; for example, the cost of 
computing time and the average number of people (total 
manmonths of effort divided by the duration) as a 
function of the total effort. The responsibility of 
relating the lines of executable assembly code to 
lines of delivered source code rests with the model 
user.... A scaling law for the Walston-Felix model can 
be derived from internal avionics historical data. 

D. PUTNAM'S SOFTWARE LIFE CYCLE COST MODEL (SLIM)

Purpose

A descriptive cost model, coupled with informed opinion,
will aid in answering top-level management questions about
the development of OFP software. Descriptive statistics
associated with expected OFP software cost, development
time, manning levels, and perturbations about these
estimates are significant management interests at pre-RFP time.
The Air Force can specify a useful lifetime, say 10 years,
and obtain a quantitative cost estimate of the OFP software
life cycle subject to the assumptions of the model.



Input 

Three input parameters are required to calibrate this 
model's technology constant (Ck) for avionics applications. 
The F-111 data point... was the basis for this calibration. 
The three data points are: 

a. Number of delivered lines of executable source code, 
not including comments; 22,100. 

b. Number of manmonths for developing software: 805. 

c. Number of calendar months for developing software: 33. 

The user is prompted for all inputs by the EDITOR built
into the SLIM cost model. Seventeen on-line inputs required
for this model are as follows:

a. Enter title of software system. Avionics, F-111

b. Enter start date (MMYY). 0174

c. Enter the fully burdened labor rate ($/MY) at your
organization. 60000

d. Enter the standard deviation of your labor rate
($/MY). 6000

e. Enter the anticipated inflation rate as a decimal
fraction. 0.065

f. Enter the proportion of development that will occur in
on-line, interactive mode. 0

g. Enter the proportion of the development computer
that is dedicated to this system development effort.
0.2

h. Enter the proportion of the system that will be
coded in a HOL. 0

i. Enter the number corresponding to the primary
language to be used. (Twelve choices are given.) 10
= assembly level language.

j. Enter the number corresponding to the type of your
system. 1

1. Real-time or time critical system 

2. Operating system 



3. Command and control 

4. Business application 

5. Telecommunication and message switching 

6. Scientific system 

7. Process control. 

k. Choose the response below which best describes your
system. 2

1. The system is entirely new, with many interfaces,
and must interact within a total management
information system structure.

2. This is a new stand-alone system. It is simpler
because the interface problem with other systems
is eliminated.

3. This is a rebuilt system with large segments of
existing logic. The primary tasks are recording,
integration, interfacing, and minor enhancements.

4. This is a composite system made up of a set of
independent subsystems with few interactions and
interfaces among them. Development of the
independent subsystems will occur with considerable
overlap.

5. This is a composite system made up of a set of
independent subsystems with a minimum of
interactions and interfaces among them. Development will
occur in parallel.

l. Enter the proportion of memory of the target
machine that will be utilized by the software system.
0.85

m. Enter the proportion of real-time code. 1

n. Below is a set of modern programming techniques
that may be used on a software development project.
Beside each are three possible responses indicating
the degree of usage on your system. 1

   Technique                      Response
   Structured Programming         1) < 25 %
                                  2) 25 - 75 %
                                  3) > 75 %
   Design and Code Inspection     1) < 25 %
                                  2) 25 - 75 %
                                  3) > 75 %
   Top-down Development           1) < 25 %
                                  2) 25 - 75 %
                                  3) > 75 %
   Chief Programmer Teams         1) < 25 %
                                  2) 25 - 75 %
                                  3) > 75 %

o. Below are two indicators of personnel that impact the
cost and time to do a project. Beside each are three
possible answers indicating the degree of experience. 2

   Personnel Experience               Response
   Overall Skill and Qualification    1) Minimal
                                      2) Average
                                      3) Extensive
   With Development Computer          1) Minimal

p. Enter sizing information in one of two forms:

1. An overall range of sizes, or

2. Ranges of size on a module-by-module basis.

Enter 1 or 2 to indicate how sizing data should be
entered. 1

q. Enter the lowest possible and highest possible size in
source statements. 18100, 26100

Computational Procedure

Total effort can be determined from the software
equation developed by L. H. Putnam (Putnam, 1978; Putnam and
Fitzsimmons, 1979). The software equation is modified by
the environmental input parameters, items f through o. The
software equation is:

   S_s = C_k K^{1/3} t_d^{4/3}

where

   S_s = number of delivered lines of executable source
         code, not including comments

   C_k = a state of technology constant; previous
         experience with computer response times and
         programming practices gives:

         C_k = 754 for avionics, assembly-level language
         C_k = 4984 for "1973-style" arbitrary development
         C_k = 10040 for "1979-style" structured
               development.

   K   = Rayleigh/Norden life cycle effort parameter in
         units of manmonths or manyears

   t_d = Rayleigh/Norden time parameter. Time at which
         peak manpower nominally occurs for large
         software projects. Mathematically, it is the peak
         of the curve.

   K/t_d^2 = system difficulty, or ratio of total effort to
         development time squared.

The software equation is used to obtain engineering
quality estimates during the early phases of a software
project. The software equation is solved using a gradient
constraint, K = |\nabla D| t_d^3, where the magnitude of the
difficulty gradient is empirically found for a particular
development environment. Monte Carlo simulation is used to
generate descriptive statistics associated with the effort,
development time, and development cost. The standard deviations
are used in calculating risk profiles.

The effort, time, and cost point estimates can be
presented in the form of probability plots assuming a
Gaussian distribution. All that is needed is an estimate of the
expected value (plotted at the 50 percent probability level)
and the standard deviation (plotted offset from the expected 
value at the 16 percent probability level) to generate the 
probability line on ordinary probability paper. Then one 
can determine for example, that there is a 90 percent prob- 
ability that the software development will not take more 
than x-manmonths of effort. When repeated for all prob- 
ability levels of interest, one has a risk profile for that 
estimate. 
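
The probability-paper construction described above can also be carried out numerically. The Python sketch below uses the standard library's `statistics.NormalDist` with the development effort mean and standard deviation reported in the SLIM output later in this section (891.0 and 106.9 manmonths). The choice of probability levels and the function name are illustrative assumptions, and the Gaussian form is the same assumption the text makes.

```python
from statistics import NormalDist

# Development effort in manmonths: mean and standard deviation as
# reported in the SLIM simulation summary.
effort = NormalDist(mu=891.0, sigma=106.9)


def effort_not_exceeded(probability):
    """Effort level that will not be exceeded with the given probability."""
    return effort.inv_cdf(probability)


# A small risk profile: effort at several probability levels.
profile = {p: effort_not_exceeded(p) for p in (0.16, 0.50, 0.84, 0.90)}
```

Under these assumptions there is a 90 percent probability that development takes no more than about 1,028 manmonths; repeating the lookup over the levels of interest gives the risk profile.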

The tradeoff law can be obtained from the software
equation by solving for K. With a Monte Carlo simulation for
generating variances for K and td one can perform a tradeoff
analysis, pick a reasonable effort (or cost) time
combination and complete the sensitivity analysis. The value of
simulating several thousand Monte Carlo runs is that it
produces a measure of the variation in effort and
development time, or the risk profile. Knowing the sensitivities,
the Air Force PM can use it effectively in planning and
contracting so that the risk level is always within
acceptable range. Examples of this procedure are given in the
COMPSAC 77 tutorial (Putnam and Wolverton, 1977).
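
To make the tradeoff law concrete, the sketch below solves the software equation for K and shows the resulting fourth-power dependence of effort on development time. It is an illustration under stated assumptions rather than SLIM's actual implementation: the function names are invented, units must match whatever units were used to calibrate the technology constant, and the rescaling step assumes development effort moves in proportion to K. The 889 manmonths at 34.8 months comes from the tradeoff analysis in the Output subsection.

```python
def size_from_effort(ck, k, td):
    """Software equation: size = Ck * K**(1/3) * td**(4/3)."""
    return ck * k ** (1.0 / 3.0) * td ** (4.0 / 3.0)


def effort_from_size(ss, ck, td):
    """Software equation solved for K: (Ss / (Ck * td**(4/3))) ** 3."""
    return (ss / (ck * td ** (4.0 / 3.0))) ** 3


def rescale_effort(effort, td_old, td_new):
    """With size and Ck fixed, effort varies as 1 / td**4."""
    return effort * (td_old / td_new) ** 4


# Stretch the minimum-time solution (889 manmonths at 34.8 months)
# out to a 36.0-month schedule.
stretched = rescale_effort(889.0, 34.8, 36.0)
```

The result, about 776 manmonths, is close to the 778 manmonths listed for a 36.0-month schedule in the tradeoff analysis, which suggests the tabulated values follow the same fourth-power relation up to rounding.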
Output

Three options are available to the user: calibrate,
editor, estimate. The option chosen for this illustration
was "estimate." A file is built from the previous input
data, and an on-line comment shows that the input data check
was acceptable. The structure of the on-line output is
shown below:

a. Summary of input parameters: table of all inputs.
Annotated comment shows Ck, the technology constant,
was separately computed to be 754.

b. Simulation: system cost summary is given as follows:

                                         Mean      Std Dev
   System Size (STMTS)                 22100.0     1333.0
   Minimum Development Time (Months)      34.8        1.2
   Development Effort (Manmonths)        891.0      106.9
   Development Cost (x $1000)
     - Uninflated dollars               4461.0      711.0
     - Inflated dollars                 4887.0      787.0

c. Sensitivity profile for minimum time solution (i.e.,
expected values of time, effort, and cost for the whole
size profile):

                  Source                  Man-      Cost
                  Statements   Months     Months    (x $1000)
   -3 SD            18100       31.9        525       2627
   -1 SD            20767       33.9        763       3814
   Most Likely      22100       34.8        891       4461
   +1 SD            23433       35.6       1034       5172
   +3 SD            26100       37.3       1331       6657

   Where SD = Standard Deviation

d. A cross-check with data from other systems of the
same size for the most likely estimates is given. As
compared with the RADC data base (which is a mixture
of software projects), the remarks show less than
normal productivity for avionics OFP software. This
is to be expected.

e. An on-line information note gives the user 14 options 
for the remaining output; several of these will be 
given to show the management parameters available. 

f. Linear program; this function uses the technique of 
linear programming to determine the minimum effort 
(and cost) or the minimum time in which a system can 
be built. The results are based on the actual 
manpower, cost, and schedule constraints of the user, 
combined with the system constraints provided earlier. 

1. Enter the maximum development cost in dollars. 
4500000 

2. Enter maximum development time in months. 36 

3. Enter the minimum and maximum number of people 
allowed on board at peak manloading time. 15, 40 







                     Time           Effort     Cost
                                               (x $1000)
   Minimum Cost      36.0 Months    778 MM     3892
   Minimum Time      34.8 Months    889 MM     4446

g. A tradeoff analysis within these limits is shown
below.

   Time     Manmonths    Cost (x $1000)
   34.8        889           4446
   35.0        869           4345
   35.2        849           4247
   35.4        830           4152
   35.6        812           4059
   35.8        794           3970
   36.0        778           3892

h. Front end estimate: recall that the SLIM model
assumes that the estimated time length is from logic
design. Therefore, a separate front end estimate is
required, as follows:

                         Time (months)              Effort (MM)
                         (L)     (E)     (H)        (L)   (E)   (H)
   Feasibility Study     [illegible]     9.6          9    35    61
   Functional Design     10.4    11.6    12.8        25    50    75

   Note: L = Low, E = Expected, H = High
i. Manloading: The table shows the mean projected effort
and associated standard deviations required for development.
The input parameters are:

                                   Mean      Std Dev

Development Effort (Manmonths)     891.0     106.9

Development Time (Months)           34.3       1.2






Time      People/Month   Std Dev   Cumulative Manmonths   Cumulative Std Dev

Jan 74          2           0                2                    0
Feb 74          5           1                7                    1
Mar 74          9           1               16                    2
  .             .           .                .                    .
  .             .           .                .                    .
Oct 76         17           2              377                  105
Nov 76         15           2              893                  107
Dec 76          7           1              900                  108



(This distribution of 36 rows is essentially a 
Rayleigh distribution over the calendar period of 
performance, with integer values for all entries.).... 
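
The shape of such a manloading table can be illustrated with a short sketch. The code below spreads a total effort over calendar months using the cumulative Rayleigh form K(1 - exp(-a t^2)) with a = 1/(2 tp^2), where tp is the month of peak staffing. The function name and the parameter values (900 manmonths, peak at month 13, 36 months) are hypothetical choices for illustration only; they are not the parameters SLIM used to produce the table above, and the output is not rounded to integers.

```python
import math

def rayleigh_manloading(total_effort, peak_month, n_months):
    """Return the manmonths expended in each month under a Rayleigh curve.

    total_effort -- area under the full (untruncated) curve, in manmonths
    peak_month   -- time at which the staffing rate peaks (assumed input)
    n_months     -- number of calendar months to tabulate
    """
    a = 1.0 / (2.0 * peak_month ** 2)

    def cumulative(t):
        # Cumulative effort expended by time t.
        return total_effort * (1.0 - math.exp(-a * t * t))

    # Effort in month m is the increase in cumulative effort over that month.
    return [cumulative(m) - cumulative(m - 1) for m in range(1, n_months + 1)]

# Hypothetical inputs, chosen only to show the shape of the curve.
monthly = rayleigh_manloading(total_effort=900.0, peak_month=13.0, n_months=36)
for month, people in enumerate(monthly, start=1):
    print(f"{month:2d} {people:6.1f}")
```

Because the curve is cut off at the last tabulated month, the monthly values sum to slightly less than the total effort passed in.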

j. Other primary outputs from the SLIM cost model
include:

1. Code production: calendar time versus cumulative
source statements

2. Computer usage: calendar time versus CPU hours

3. Documentation: expected number of pages of
documentation

4. Design-to-cost: SLIM has provided its best estimate
of the minimum time and corresponding maximum
effort (and cost) to develop your system. A
greater effort would result in a very risky time
schedule. However, if a lower effort is specified
(within reasonable limits), development is still
feasible as long as more time is allowed.

Enter desired effort in manmonths. 305

                                   Mean      Std Dev

New Development Time (Months)      35.7        1.2
New Development Cost (X $1000)     $4025     488.0



5. The original file is updated with these new
parameters, and the user can run manloading and cash
flow or life cycle to see how these savings can be
realized. This can be used iteratively to match
some projected benefit stream and get the project
approved. (Connect time was about 37 minutes to
run SLIM, at a cost of about $25.)

In summary, the SLIM model is a descriptive, macro-level
cost estimating tool applicable to OFP software, provided
that its technology constant (Ck) is calibrated from valid
historical OFP project data: number of delivered lines of
executable source code; number of manmonths from project
start to software acceptance; and number of calendar months
for the development. This step and its consequences must be
understood by the user. SLIM computes the feasibility study
and functional design as a separate front-end estimate which
must be added to the initial cost estimate. Labor mix and
work breakdown structure information is not given.
Resources are allocated against time (spread by a Rayleigh
distribution), but not against function (e.g., analysis and
design, code and debug, and test and integration). All
statistical parameters are assumed to be normally distributed
for mathematical tractability. This assumption may
contribute to the extreme sensitivity between minimum cost
and minimum time as shown in item f, linear program example;
i.e., a 3 percent change in calendar time (from 36 to 34.8
months) corresponds to a 14 percent change in cost ($3892K
to $4446K). All mathematical expressions used in the
computational procedure are continuous functions; therefore the
model will always produce a calculated estimate. As with
all models, this estimate must be tested against experience
and human insight.
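
The sensitivity figures quoted in the summary can be reproduced with simple arithmetic. The sketch below is illustrative only; the helper name `percent_change` is introduced here, and the inputs are the time and cost values quoted in the summary.

```python
def percent_change(old, new):
    """Signed percentage change from old to new."""
    return 100.0 * (new - old) / old

# Values quoted in the summary paragraph.
time_change = percent_change(36.0, 34.8)   # schedule, in months
cost_change = percent_change(3892, 4446)   # cost, in thousands of dollars

print(f"time change: {time_change:.1f} percent")
print(f"cost change: {cost_change:.1f} percent")
print(f"cost moves about {abs(cost_change / time_change):.1f} times as far as time")
```

A schedule reduction of roughly 3 percent is matched by a cost increase of roughly 14 percent, which is the sensitivity the summary warns about.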
APPENDIX C



SUPPORTING DATA AND CURVES

TABLE X
Project A Data

Actual      Time    Predicted LC    Predicted Maintenance
Manmths     Mths    Manmths         Manmths

 4.9600       1       2.6005
 5.4600       2       5.0970
 7.1380       3       7.3683
 9.1380       4       9.3453
11.9180       5      10.9544
12.1380       6      12.1522
13.1380       7      12.9205
12.1380       8      13.2663
11.9240       9      13.2183
15.2690      10      12.8235
13.2800      11      12.1413
 9.8460      12      11.2388
 8.3077      13      10.1846
10.8460      14       9.0446
 6.8460      15       7.8778
 5.8460      16       6.7342
 5.8460      17       5.5528
 3.0000      18       4.6616        1.19124
 3.2800      19       3.7780        2.22748
 2.3400      20       3.0101        2.98636
 4.0000      21       2.3583        3.40398
 3.0000      22       1.8174        3.47740
 2.0000      23       1.3778        3.26075
 2.0000      24       1.0278        2.84229
 2.0000      25       0.7545        2.32054
 2.0000      26       0.5451        1.78318
 2.0000      27       0.3877        1.29398
 2.0000      28       0.2715        0.88884
 1.5000      29       0.1871        0.57894
TABLE XI

Project B Data

Actual      Time    Predicted LC    Predicted Maintenance
Manmths     Mths    Manmths         Manmths

 5.9200       1       3.8688
 5.9200       2       7.4128
 7.8600       3      10.3523
13.4200       4      12.4888
15.8000       5      13.7264
15.5800       6      14.0751
14.3400       7      13.6363
13.1800       8      12.5768
12.0200       9      11.0966
 5.0000      10       9.3971
 4.3333      11       7.6564        1.76084
 2.7500      12       6.0121        3.32648
 4.5556      13       4.5561        4.53730
 4.4722      14       3.3355        5.29599
 5.4167      15       2.3610        5.57900
 5.5000      16       1.6169        5.43158
 5.6111      17       1.0719        4.94937
 3.7778      18       0.6882        4.25312
 3.8889      19       0.4280        3.46350
 2.7778      20       0.2580        2.68174
 1.5833      21       0.1508        1.97898
TABLE XII

Project C Data

Actual      Time    Predicted LC    Predicted Maintenance
Manmths     Mths    Manmths         Manmths

 6.0          1       2.8213
 7.5          2       5.5154
 7.0          3       7.9644
 8.5          4      10.0687
12.5          5      11.7533
12.5          6      12.9721
13.0          7      13.7095
14.0          8      13.9789
14.0          9      13.3191
14.0         10      13.2888
13.0         11      12.4601
11.0         12      11.4116
11.0         13      10.2221
 8.0         14       8.9650
 8.0         15       7.7044
 9.0         16       6.4920
 3.0         17       5.3669        0.64025
 2.0         18       4.3546        1.25743
 2.0         19       3.4692        1.82983
 2.0         20       2.7146        2.33840
 2.0         21       2.0868        2.76779
 2.5         22       1.5764        3.10709
 4.0         23       1.1704        3.35022
 3.0         24       0.3543        3.49602
 4.0         25       0.6131        3.54788
TABLE XIII

Project D Data

Actual      Time    Predicted LC    Predicted Maintenance
Manmths     Mths    Manmths         Manmths

 6.0000       1       3.3585
 9.5200       2       7.1746
 8.5769       3       9.5312
 9.6369       4      10.7213
 9.6369       5      10.7702
11.1700       6       9.3940
10.2260       7       8.4176
 5.2800       8       6.6823
 1.6800       9       4.9749        0.66747
 2.4800      10       3.4844        1.31143
 3.0000      11       2.3015        1.90977
 3.0000      12       1.4316        2.44297
 3.0000      13       0.3477        2.89525
 5.0000      14       0.4733        3.25522
 3.5000      15       0.2510        3.51641
 2.5000      16       0.1261        3.67722
 3.0000      17       0.0601        3.74074
 4.0000      18       0.0272        3.71413
 2.0000      19       0.0117        3.60786
 3.0000      20       0.0048        3.43475
 3.5000      21       0.0019        3.20901
 2.0000      22       0.0007        2.94523
 2.7600      23       0.0002        2.65777
 3.0000      24       0.0001        2.35956
 2.5000      25       0.0000        2.06207
 1.5000      26       0.0000        1.77470
 1.0000      27       0.0000        1.50475
 1.5000      28       0.0000        1.25734
 1.5000      29       0.0000        1.03565
 1.0000      30       0.0000        0.84109
 1.0000      31       0.0000        0.67365
 1.0000      32       0.0000        0.53218
 1.0000      33       0.0000        0.41475
 2.0000      34       0.0000        0.31892
TABLE XIV

Combined Project A-D Data Normalized to td=1

Actual      Time     Predicted LC    Predicted Maintenance
Manmths     Mths     Manmths         Manmths

 4.9600     0.100      2.3128
 6.0000     0.111      2.5637
 6.0000     0.167      3.8213
 5.4600     0.200      4.5433
 5.9200     0.200      4.5433
 7.5000     0.222      5.0151
 7.1380     0.300      6.6140
 7.0000     0.333      7.2503
 9.5200     0.334      7.2692
 9.1380     0.400      8.4568
 5.9200     0.400      8.4563
 8.5000     0.444      9.1807
11.9180     0.500     10.0167
 8.5769     0.501     10.0307
12.5000     0.555     10.7390
12.1380     0.600     11.2541
 7.8600     0.600     11.2541
12.5000     0.666     11.8826
 9.6369     0.668     11.8993
13.1380     0.700     12.1468
13.0000     0.777     12.5957
13.4200     0.800     12.6900
12.1380     0.800     12.6900
 9.6369     0.835     12.7992
14.0000     0.888     12.8875
11.9240     0.900     12.8950
11.1700     1.000     12.7876
15.2690     1.000     12.7876
14.0000     1.000     12.7875
15.8000     1.000     12.7876
13.2800     1.100     12.4049
14.0000     1.111     12.3478
10.2260     1.167     12.0167
 9.8460     1.200     11.7921
15.5800     1.200     11.7921
13.0000     1.222     11.6313
 8.3077     1.300     10.9993
11.0000     1.333     10.7069
 5.2800     1.334     10.6979
10.8460     1.400     10.0777
14.3400     1.400     10.0777
11.0000     1.444      9.6444
 6.3460     1.500      9.0770
 1.6800     1.501      9.0667
 8.0000     1.535      8.7165
 5.3460     1.600      3.0424
13.1800     1.600      8.0424
 8.0000     1.666      7.3605
 2.4800     1.668      7.3399
 5.8460     1.700      7.0134
 9.0000     1.777      6.2456
12.0200     1.800      6.0224
Table XIV continued

Actual      Time     Predicted LC    Predicted Maintenance
Manmths     Mths     Manmths         Manmths

 3.0000     1.800      6.0224        0.00528
 3.0000     1.835      5.6893        0.18457
 3.0000     1.888      5.2016        0.46312
 3.2800     1.900      5.0941        0.52590
 3.0000     2.000      4.2458        1.04203
 2.8400     2.000      4.2458        1.04203
 2.0000     2.000      4.2458        1.04203
 5.0000     2.000      4.2458        1.04203
 4.0000     2.100      3.4880        1.53892
 2.0000     2.111      3.4104        1.59202
 3.0000     2.167      3.0332        1.85663
 3.0000     2.200      2.8249        2.00771
 4.3333     2.200      2.8249        2.00771
 2.0000     2.222      2.6917        2.10625
 2.0000     2.300      2.2559        2.44037
 2.0000     2.333      2.0882        2.57399
 5.0000     2.334      2.0832        2.57797
 2.0000     2.400      1.7768        2.82995
 2.7500     2.400      1.7768        2.82995
 2.5000     2.444      1.5926        2.98621
 2.0000     2.500      1.3803        3.17078
 3.5000     2.501      1.3768        3.17393
 4.0000     2.535      1.2595        3.27772
 2.0000     2.600      1.0579        3.45858
 4.5556     2.600      1.0579        3.45858
 3.0000     2.666      0.8810        3.61804
 2.5000     2.668      0.8761        3.62249
 2.0000     2.700      0.7999        3.69053
 4.0000     2.777      0.6392        3.83018
 4.4722     2.800      0.5969        3.86530
 2.0000     2.800      0.5969        3.86530
 3.0000     2.835      0.5370        3.91294
 1.5000     2.900      0.4395        3.98301
 4.0000     3.000      0.3194        4.04514
 5.4167     3.000      0.3194        4.04514
 2.0000     3.167      0.1820        4.03290
 5.5000     3.200      0.1622        4.01462
 3.0000     3.334      0.1001        3.89258
 5.6111     3.400      0.0782        3.80710
 3.5000     3.501      0.0531        3.64877
 3.7778     3.600      0.0358        3.46657
 2.0000     3.668      0.0272        3.32901
 3.8889     3.800      0.0156        3.04092
 2.7600     3.835      0.0134        2.96117
 2.7778     4.000      0.0065        2.57596
 3.0000     4.000      0.0065        2.57596
 2.5000     4.167      0.0030        2.18624
 1.5833     4.200      0.0025        2.11088
 1.5000     4.334      0.0013        1.31450
 1.0000     4.501      0.0006        1.47364
 1.5000     4.668      0.0002        1.17173
 1.5000     4.835      0.0001        0.91255
 1.0000     5.000      0.0000        0.69870
 1.0000     5.167      0.0000        0.52270
 1.0000     5.334      0.0000        0.38337
 1.0000     5.501      0.0000        0.27572
 2.0000     5.668      0.0000        0.19449
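
The normalization used to combine the four projects can be sketched as follows. This is a hedged illustration, not the procedure used to generate Table XIV: the development times in `ASSUMED_TD` are inferred here by comparing the monthly tables with the normalized time column (for example, month 1 of Project A appears at 0.100), and the names `normalize` and `combine` are hypothetical. The printed table also rounds some fractions differently (0.334 rather than 0.333 for Project D), which this sketch does not try to reproduce.

```python
# Development times in months. Inferred from the tables, not stated in the text.
ASSUMED_TD = {"A": 10, "B": 5, "C": 9, "D": 6}

def normalize(month, td):
    """Express a calendar month as a fraction of the development time td."""
    return month / td

def combine(projects, td_by_project=ASSUMED_TD):
    """Merge per-project (month, actual manmonths) rows on a common time axis.

    projects -- mapping of project name to a list of (month, actual) pairs
    Returns a list of (normalized time, actual, project) sorted by time.
    """
    rows = []
    for name, data in projects.items():
        td = td_by_project[name]
        for month, actual in data:
            rows.append((normalize(month, td), actual, name))
    return sorted(rows)

# First rows of Projects A and B, taken from Tables X and XI.
sample = {"A": [(1, 4.9600), (2, 5.4600)], "B": [(1, 5.9200)]}
for t_norm, actual, name in combine(sample):
    print(f"{actual:8.4f}  {t_norm:5.3f}  {name}")
```

Dividing by td puts projects of different lengths on one axis, so a single curve can be compared against all four data sets at once.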
TABLE XV

NASA Project Data

DATE     MHRS    MMTHS    MYRS       DATE     MHRS    MMTHS    MYRS

 3/75     593    0.797    0.067       9/78     450    0.625    0.051
 4/75     653    0.907    0.074      10/78     450    0.605    0.051
 5/75     773    1.039    0.088      11/78     400    0.556    0.046
 6/75     780    1.083    0.089      12/78     410    0.551    0.047
 7/75     864    1.161    0.098       1/79     510    0.685    0.058
 8/75     929    1.249    0.106       2/79     420    0.625    0.048
 9/75     953    1.324    0.109       3/79     370    0.497    0.042
10/75    1013    1.362    0.115       4/79     410    0.569    0.047
11/75    1006    1.397    0.115       5/79     390    0.524    0.044
12/75    1037    1.394    0.118       6/79     440    0.611    0.050
 1/76    1061    1.426    0.121       7/79     670    0.901    0.076
 2/76     877    1.260    0.100       8/79     520    0.699    0.059
 3/76    1150    1.546    0.131       9/79     580    0.806    0.066
 4/76    1073    1.490    0.122      10/79     440    0.599    0.050
 5/76    1055    1.419    0.120      11/79     294    0.408    0.034
 6/76    1108    1.539    0.126      12/79     275    0.370    0.031
 7/76    1000    1.344    0.114       1/80     410    0.551    0.047
 8/76     867    1.177    0.100       2/80     367    0.527    0.042
 9/76     640    0.889    0.073       3/80     541    0.727    0.062
10/76     422    0.567    0.048       4/80     482    0.669    0.055
11/76     340    0.472    0.039       5/80     299    0.402    0.034
12/76     260    0.349    0.030       6/80     449    0.624    0.051
 1/77     188    0.253    0.021       7/80     418    0.562    0.048
 2/77     290    0.432    0.033       8/80     216    0.290    0.025
 3/77     444    0.597    0.051       9/80     214    0.297    0.024
 4/77     390    0.542    0.044      10/80     230    0.309    0.026
 5/77     280    0.376    0.032      11/80     361    0.501    0.041
 6/77     320    0.444    0.036      12/80     377    0.507    0.043
 7/77     260    0.349    0.029       1/81     487    0.655    0.055
 8/77     274    0.368    0.031       2/81     628    0.935    0.072
 9/77     212    0.294    0.024       3/81     500    0.672    0.057
10/77     280    0.376    0.032       4/81     537    0.746    0.061
11/77     340    0.472    0.039       5/81     386    0.519    0.044
12/77     368    0.495    0.042       6/81     321    0.446    0.037
 1/78     718    0.965    0.082       7/81     492    0.661    0.056
 2/78     480    0.714    0.055       8/81     656    0.882    0.075
 3/78     420    0.565    0.048       9/81      73    0.101    0.008
 4/78     410    0.569    0.047      10/81     570    0.766    0.065
 5/78     290    0.390    0.033      11/81     416    0.578    0.047
 6/78     290    0.403    0.033      12/81     352    0.473    0.040
 7/78     360    0.484    0.041       1/82     830    1.116    0.095
 8/78     360    0.484    0.041
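
The relation between the MHRS and MMTHS columns is not stated, but the tabulated values appear consistent with dividing the manhours by the number of hours in that calendar month (24 times the number of days). The sketch below tests that inference against a few rows; the function name is hypothetical and the conversion rule is an assumption drawn from the data, not a documented procedure.

```python
import calendar

def manmonths_from_manhours(manhours, year, month):
    """Convert manhours to manmonths, assuming one manmonth equals every
    hour of the calendar month (24 hours times the days in the month)."""
    days = calendar.monthrange(year, month)[1]
    return manhours / (24 * days)

# Rows taken from the table: (month, year, MHRS, tabulated MMTHS).
SAMPLE_ROWS = [(3, 1975, 593, 0.797), (2, 1976, 877, 1.260), (9, 1978, 450, 0.625)]

for month, year, mhrs, tabulated in SAMPLE_ROWS:
    computed = manmonths_from_manhours(mhrs, year, month)
    print(f"{month}/{year % 100}: computed {computed:.3f}, tabulated {tabulated:.3f}")
```

The February 1976 row is a useful check because 1976 was a leap year, so the divisor is 24 times 29 rather than 24 times 28.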
TABLE XVI Project A Curves

[Plot not reproduced. Legend: VARIABLE 2 TIMET VERSUS VARIABLE 1 MANMTH, SYMBOL=A; VARIABLE 2 TIMET VERSUS VARIABLE 3 PMAMTH, SYMBOL=*; VARIABLE 2 TIMET VERSUS VARIABLE 4 PRMAIN, SYMBOL=#]

TABLE XIX Project D Curves

TABLE XX Combined Project A-D Curves